We study high-dimensional sample covariance matrices based on independent random vectors
with missing coordinates. The presence of missing observations is common in modern
applications such as climate studies or gene expression micro-arrays. A weak approximation on the
spectral distribution in the "large dimension $d$ and large sample size $n$" asymptotics is derived
for possibly different observation probabilities in the coordinates. The spectral distribution turns
out to be strongly influenced by the missingness mechanism. In the null case under the
missing at random scenario where each component is observed with the same probability $p$, the
limiting spectral distribution is a Marcenko-Pastur law shifted by $(1 - p)/p$ to the left. As
$d/n \to y \in (0,1)$, the almost sure convergence of the extremal eigenvalues to the respective
boundary points of the support of the limiting spectral distribution is proved, which are
explicitly given in terms of $y$ and $p$. Eventually, the sample covariance matrix is positive definite if $p$
is larger than

$$1 - (1 - \sqrt{y})^2,$$

whereas this is not true any longer if $p$ is smaller than this quantity.

Keywords: Sample covariance matrix with missing observations, limiting spectral distribution, 
Stieltjes transform, almost sure convergence of extremal eigenvalues, characterization of positive 
definiteness. 


1. Introduction 

In many modern applications high-dimensional data suffers from missing observations.
As pointed out in Troyanskaya et al. (2001), "The data from microarray experiments is
usually in the form of large matrices of expression levels of genes (rows) under different
experimental conditions (columns) and frequently with some values missing. Missing
values occur for diverse reasons, including insufficient resolution, image corruption, or
simply due to dust or scratches on the slide. Missing data may also occur systematically
as a result of the robotic methods used to create them." "Data available for climate
research typically suffer from uneven sampling due to ... sporadic instrument failure; or
other interruptions during the period of interest," Sherwood (2001). Further, missing
observations in telescope data may be caused by a cloudy sky, Nishizawa and Inoue (2013).
In the statistical literature, high-dimensional low-rank covariance matrix estimation with
missing observations has been recently investigated in Lounici (2014), where sparsity
oracle inequalities for a matrix-Lasso estimator are derived. An adaptive test for large
covariance matrices with missing observations has been proposed recently in Butucea
and Zgheib (2016). While, in view of inference statements, asymptotic properties of the
eigenvalues and eigenvectors of high-dimensional sample covariance matrices based on
complete data are exhaustively investigated in random matrix theory, the statistically
equally important case of missing observations has not been studied so far. For
spectral based dimension reduction techniques and statistics such as the log-determinant,
a profound spectral analysis is indispensable. The aim of this article is to get this
development underway. We study asymptotic spectral properties of high-dimensional sample
covariance matrices with missing observations. Let

$$Y_1, \dots, Y_n \in \mathbb{R}^d$$

be a sample of independent identically distributed (iid) random vectors with covariance
matrix

$$T = \mathbb{E}\big((Y_1 - \mathbb{E}Y_1) \otimes (Y_1 - \mathbb{E}Y_1)\big).$$


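In examples as described above, we do not observe the whole random vector $Y_k$ but only some
of its components. This missingness is represented by a random matrix
$\varepsilon \in \{0,1\}^{d \times n}$ with entries

$$\varepsilon_{ik} = \begin{cases} 1 & \text{if } Y_{ik} \text{ is observed,} \\ 0 & \text{if } Y_{ik} \text{ is missing.} \end{cases}$$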


Under the assumption that the matrices $Y$ and $\varepsilon$ are independent, the estimator

$$\hat T_{ij} = \frac{1}{N_{ij}} \sum_{k \in \mathcal{N}_{ij}} (Y_{ik} - \bar Y_i)(Y_{jk} - \bar Y_j)$$

is the analogue of the sample covariance and hence the natural estimator for $T_{ij}$, where

$$\mathcal{N}_{ij} = \{k : \varepsilon_{ik}\varepsilon_{jk} = 1\}, \qquad N_{ij} = \#\mathcal{N}_{ij} \tag{1.1}$$

and

$$\bar Y_i = \frac{1}{N_{ii}} \sum_{k \in \mathcal{N}_{ii}} Y_{ik}.$$

Subsequently, $\hat T = (\hat T_{ij}) \in \mathbb{R}^{d \times d}$ is referred to as sample covariance
matrix with missing observations. If $\mathbb{E}Y_k = 0$ is known in advance one typically uses the
estimator

$$\hat\Sigma_{ij} = \frac{1}{N_{ij}} \sum_{k \in \mathcal{N}_{ij}} Y_{ik} Y_{jk}.$$

In what follows we write S for $\hat T$ and $\hat\Sigma$ if a statement holds for both estimators. The
distribution of the missingness matrix $\varepsilon$ substantially influences the spectrum of S (see
Figure 1). In the high-dimensional scenario, S may be asymptotically indefinite even if
the smallest eigenvalue of $T$ stays uniformly bounded away from zero. Heuristically, it is
not clear at all how the high dimensionality affects the spectral properties in the situation
of missing observations, and whether well-known phenomena occur in a possibly modified
way. In this article we investigate asymptotic spectral properties of S under the classical
missing (completely) at random (MAR) setting. Here, the variables $\varepsilon_{ik}$,
$i = 1, \dots, d$, $k = 1, \dots, n$, are independent Bernoulli random variables with

$$P(\varepsilon_{ik} = 1) = p_i \quad \text{and} \quad P(\varepsilon_{ik} = 0) = 1 - p_i,$$

and they are jointly independent of $Y_1, \dots, Y_n$. The latter are assumed to be of the form

$$Y_k = T^{1/2} X_k + \mathbb{E}Y_k, \quad k = 1, \dots, n,$$

where $X_1, \dots, X_n$ are iid centered random vectors with independent coordinates of
variance 1. This representation is common in literature on random matrix theory. Without
missing observations, that is, for completely observed random vectors $Y_1, \dots, Y_n$, the
classical sample covariance matrix is a well-studied object in the large dimension $d$ and
large sample size $n$ asymptotics. The first result on its spectral distribution is due to
Marcenko and Pastur (1967). They established in particular weak convergence in
probability of the empirical spectral distribution for diagonal $T$ under the assumption of finite
fourth moment on the entries of $X_1, \dots, X_n$ and some dependency condition reflected in
their mixed second and fourth moments. The most general version of this statement has
been proved in Silverstein (1995), where weak convergence (almost surely) is established
under the finite second moment assumption for rather general matrices $T$. The almost
sure convergence of the largest eigenvalue in the null case $T = I_{d \times d}$ (identity matrix) has
been proved in Yin, Bai and Krishnaiah (1988) under the assumption of the existence of
the fourth moment, which generalizes a first result in this direction due to Geman (1980).
Bai, Silverstein and Yin (1988) have shown that the existence of the fourth moment is
in fact necessary. Concerning the smallest eigenvalue in the null case, the most current
theorem on its almost sure convergence has been derived by Bai and Yin (1993). Under
quite general regularity conditions on $T$, the convergence of the extremal eigenvalues to
the respective boundaries of the support of the limiting spectral distribution follows from
Bai and Silverstein (1998).

Figure 1. The left column shows histograms of the eigenvalues of the estimator $\hat\Sigma$ and the right column
of the estimator $\hat T$ from a centered Gaussian sample. The underlying population covariance matrix in
each histogram is the identity. The dimension of the observations in the first row is 2000, the sample size
8000 and all coordinates are observed. In the second row each coordinate is observed with probability
1/2. In the last row the probabilities of observation are changed to 1/4 for the first 1000 coordinates and
to 3/4 for the other half of the coordinates.
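The pairwise-complete estimators and the Bernoulli missingness model described in this section can be sketched numerically. The following is a minimal illustration, assuming NumPy is available; the function names `masked_sample_covariance` and `simulate_null_case` are hypothetical and not taken from the article, and the sketch assumes that every pair of coordinates is jointly observed at least once.

```python
import numpy as np

def masked_sample_covariance(Y, eps, known_zero_mean=False):
    """Entrywise estimate (1/N_ij) * sum over the samples k observed in both rows i and j.

    Y   : (d, n) data matrix, columns are the sample vectors
    eps : (d, n) 0/1 matrix, eps[i, k] = 1 if Y[i, k] is observed
    Assumption: every pair (i, j) is jointly observed at least once.
    """
    Y = np.asarray(Y, dtype=float)
    eps = np.asarray(eps, dtype=float)
    N = eps @ eps.T                      # N[i, j] = #{k : eps_ik * eps_jk = 1}
    if known_zero_mean:
        centered = Y * eps
    else:
        # mean over the observed entries of each row
        row_means = (Y * eps).sum(axis=1) / np.diag(N)
        centered = (Y - row_means[:, None]) * eps
    return (centered @ centered.T) / N

def simulate_null_case(d, n, p, seed=0):
    """Identity population covariance, each coordinate observed with probability p."""
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((d, n))
    eps = (rng.random((d, n)) < p).astype(float)
    return masked_sample_covariance(Y, eps, known_zero_mean=True)

# Eigenvalues of one simulated matrix; a histogram of these is a small-scale
# analogue of the panels described in Figure 1.
eigenvalues = np.linalg.eigvalsh(simulate_null_case(d=50, n=200, p=0.5))
```

With a mask of ones the function reduces to the classical sample covariance, which is a convenient sanity check before experimenting with smaller observation probabilities.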

Our contributions in this article are the following. 

(i) We establish a weak approximation of the empirical spectral distribution of the
sample covariance matrix with missing observations S by a non-random sequence
of probability measures expressed in terms of their Stieltjes transforms, which holds
true for possibly different observation probabilities in the coordinates. In the null
case under the missing at random scenario where each component is observed with
the same probability $p$, the limiting spectral distribution is shown to be a
Marcenko-Pastur law shifted by $(1 - p)/p$ to the left.

(ii) As $d/n \to y \in (0,1)$ and under the missing at random scenario where each
component is observed with the same probability, we prove almost sure convergence
of the extremal eigenvalues of $\hat\Sigma$ to the respective boundary points of the support
of the limiting spectral distribution in the null case. A statistically important
consequence is the characterization of positive definiteness for the sample covariance
matrix with missing observations.

Understanding the empirical spectral distribution of sample covariance matrices with
missing observations is of great importance to develop improved estimators for the
population covariance matrix and the precision matrix. Such estimators have been already
established for completely observed data by El Karoui (2008) and Ledoit and Wolf (2012)
based on non-linear shrinkage of the eigenvalues. However, if some data is missing, the
situation is more intricate since the analysis in our article reveals that the limiting
behavior of the empirical spectral distribution does not only depend on the eigenvalues of
the population covariance matrix but also on its eigenvectors. Nevertheless, we expect
that adjusting the diagonal of the sample covariance matrix with missing observations
yields a more suitable matrix for spectrum estimation.

Very recently, various authors studied asymptotic spectral properties of sample
autocovariance matrices of high-dimensional time series, which is another statistically relevant
scenario. Jin et al. (2014) derived the limiting spectral distribution of the symmetrized
autocovariance matrix in the iid case. Liu, Aue and Paul (2015) established a
Marcenko-Pastur-type law for the empirical spectral distribution in case of general high-dimensional
linear time series. They investigated the moderately high-dimensional case of this
problem in Wang, Aue and Paul (2015). Li, Pan and Yao (2015) developed the limiting
singular value distribution of the sample autocovariance matrix by means of the Stieltjes
transform for an independent sequence with elements possessing finite fourth moments.
Wang and Yao (2015) proved the same result by the method of moments, and
additionally the almost sure convergence of the spectral norm. The strong limit of the extreme
eigenvalues of symmetrized autocovariance matrices is established in Wang et al. (2015).

The article is organized as follows. First we introduce the essential notation and the
model assumptions in the next section. Section 3 is devoted to our main results. The
proof of Theorem 3.1 is quite long and therefore decomposed into Section 4, Section 5
and Appendix A. The proof of Theorem 3.3 is deferred to Section 6 and Appendix B.
Some auxiliary results which are used throughout the proofs are collected in Appendix C.


2. Notation and preliminaries 


2.1. Notation 


For any bounded function $f : \mathbb{R} \to \mathbb{R}$,

$$\|f\| = \sup_{x \in \mathbb{R}} |f(x)|$$

denotes its supremum norm. If $f$ is Lipschitz in addition then the bounded Lipschitz
norm is defined as

$$\|f\|_{BL} = \max(\|f\|_L, \|f\|),$$

where $\|f\|_L$ denotes the best Lipschitz constant of $f$. We write

$$\mathbb{C}^+ = \{z \in \mathbb{C} : \Im(z) > 0\}$$

for the upper complex half plane. For any Hermitian matrix $A \in \mathbb{C}^{d \times d}$ denote the
(normalized) spectral measure by

$$\mu^A = \frac{1}{d} \sum_{i=1}^d \delta_{\lambda_i(A)},$$

where $\lambda_1(A) \ge \dots \ge \lambda_d(A)$ are the eigenvalues of $A$ and $\delta_x$ denotes the Dirac measure in
$x$. If it is clear that we refer to a matrix $A$, we use the shortened notation $\lambda_1 \ge \dots \ge \lambda_d$.
We write $A^*$ for the adjoint of $A$. Let us introduce the Schatten norms for matrices

$$\|A\|_{S_p} = \Big( \sum_{i} \lambda_i(AA^*)^{p/2} \Big)^{1/p}, \quad p \ge 1.$$


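Furthermore, $\operatorname{tr}(A)$ denotes the trace of $A$ and $\operatorname{rank}(A)$ its rank. For two matrices
$A, B \in \mathbb{R}^{d \times n}$ write $A \circ B = (A_{ik}B_{ik})_{i,k}$ for the Hadamard product. For any vector
$v \in \mathbb{R}^d$, $\operatorname{diag}(v) \in \mathbb{R}^{d \times d}$ is the diagonal matrix with the $i$-th diagonal entry equal
to $v_i$. With slight abuse of notation we also write $\operatorname{diag}(A)$ for $\operatorname{diag}(A_{11}, \dots, A_{dd})$,
$A \in \mathbb{R}^{d \times d}$. The Stieltjes transform of a measure $\mu$ on the real line is defined by

$$m_\mu(z) = \int_{\mathbb{R}} \frac{1}{\lambda - z} \, d\mu(\lambda), \quad z \in \mathbb{C}^+.$$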
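For a normalized spectral measure the Stieltjes transform is a finite average, which the following minimal sketch evaluates directly. The function name `stieltjes_transform` is an illustrative choice and not part of the article; only the Python standard library is used.

```python
def stieltjes_transform(eigenvalues, z):
    """m(z) = (1/d) * sum_i 1 / (lambda_i - z) for the measure (1/d) * sum_i delta_{lambda_i}."""
    if z.imag <= 0:
        raise ValueError("z must lie in the upper complex half plane")
    return sum(1.0 / (lam - z) for lam in eigenvalues) / len(eigenvalues)

# Example: eigenvalues 0 and 2, evaluated at z = i.
m = stieltjes_transform([0.0, 2.0], 1j)
```

Two properties used repeatedly in the proofs can be read off such evaluations: the value lies again in the upper half plane, and its modulus is at most $1/\Im(z)$ for a probability measure.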


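On the space of probability measures on $\mathbb{R}$ recall the following distance measures:

Kolmogorov metric:
$$d_K(\mu, \nu) = \sup_{x \in \mathbb{R}} \big| \mu((-\infty, x]) - \nu((-\infty, x]) \big|,$$

Dual bounded Lipschitz metric:
$$d_{BL}(\mu, \nu) = \sup_{\|f\|_{BL} \le 1} \int_{\mathbb{R}} f \, d(\mu - \nu),$$

Levy metric:
$$d_L(\mu, \nu) = \inf\big\{ \varepsilon > 0 : \mu((-\infty, x - \varepsilon]) - \varepsilon \le \nu((-\infty, x]) \le \mu((-\infty, x + \varepsilon]) + \varepsilon \text{ for all } x \in \mathbb{R} \big\}.$$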
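For two discrete measures with equal weights, such as two empirical spectral measures, the Kolmogorov metric can be evaluated exactly because the supremum is attained at one of the atoms. The sketch below is only an illustration of the definition; the name `kolmogorov_distance` is hypothetical.

```python
def kolmogorov_distance(xs, ys):
    """sup_x |F_mu(x) - F_nu(x)| for the uniform measures on the points xs and ys."""
    best = 0.0
    # Both distribution functions are right-continuous step functions, so it
    # suffices to compare them at the jump points.
    for t in sorted(set(xs) | set(ys)):
        f = sum(1 for x in xs if x <= t) / len(xs)
        g = sum(1 for y in ys if y <= t) / len(ys)
        best = max(best, abs(f - g))
    return best
```

The quadratic running time is chosen for readability; for large spectra one would sort once and merge the two lists.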


We will frequently make use of the well-known relation $d_L(\mu, \nu) \le d_K(\mu, \nu)$ for any two
probability measures $\mu$ and $\nu$ on the real line, cf. Petrov (1995), p. 43. For any measures
$\mu$ and $\nu$, $\mu \star \nu$ denotes their convolution. As usual, $\Rightarrow$ stands for weak convergence. The
Marcenko-Pastur distribution with parameters $y, \sigma^2 > 0$ is given by

$$\mu^{MP}_{y,\sigma^2} = \Big(1 - \frac{1}{y}\Big)_+ \delta_0 + \frac{1}{2\pi\sigma^2} \, \frac{\sqrt{(b - x)(x - a)}}{yx} \, 1\{a \le x \le b\} \, dx \tag{2.1}$$

with $a = \sigma^2(1 - \sqrt{y})^2$ and $b = \sigma^2(1 + \sqrt{y})^2$. Moreover, for $\sigma^2 > 0$ let
$\mu^{MP}_{0,\sigma^2} = \delta_{\sigma^2}$. The notation $\lesssim$ means less or equal up to some positive
multiplicative constant which does not depend on the variable parameters in the expression.
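The density of the Marcenko-Pastur law and its support boundaries are straightforward to evaluate. The following minimal sketch, with hypothetical function names and using only the standard library, computes the edges $a$ and $b$, the Lebesgue density, and a midpoint-rule approximation of the mass of the continuous part.

```python
import math

def mp_edges(y, sigma2):
    """Boundary points a and b of the support of the continuous part."""
    a = sigma2 * (1.0 - math.sqrt(y)) ** 2
    b = sigma2 * (1.0 + math.sqrt(y)) ** 2
    return a, b

def mp_density(x, y, sigma2):
    """Lebesgue density of the Marcenko-Pastur law at x (zero outside (a, b))."""
    a, b = mp_edges(y, sigma2)
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((b - x) * (x - a)) / (2.0 * math.pi * sigma2 * y * x)

def mp_continuous_mass(y, sigma2, steps=20000):
    """Midpoint-rule approximation of the integral of the density over [a, b]."""
    a, b = mp_edges(y, sigma2)
    h = (b - a) / steps
    return h * sum(mp_density(a + (i + 0.5) * h, y, sigma2) for i in range(steps))
```

For $y < 1$ the continuous part carries the whole mass, whereas for $y > 1$ it carries mass $1/y$ and the remainder sits at the origin.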


2.2. Preliminaries 

Let $(X(i,k))_{i,k \in \mathbb{N}}$ be a double array of iid centered random variables with unit
variance. The left upper $d \times n$ submatrix is denoted by $X_{d,n}$. Then the random vectors
$Y_{1,d,n}, \dots, Y_{n,d,n} \in \mathbb{R}^d$ are the columns of the matrix $Y_{d,n}$, where

$$Y_{d,n} - \mathbb{E}Y_{d,n} = T_{d,n}^{1/2} X_{d,n}$$

with

$$T_{d,n} = \operatorname{diag}(T_{11,d,n}, \dots, T_{dd,d,n}).$$
This structure on the population covariance matrix is the simplest one which allows to
visualize the effects of missing observations on the spectrum of the sample covariance
matrix. The non-diagonal case is discussed at the end of Section 4. Its treatment requires
some technical modification of the arguments presented here but not substantially new
ideas and is beyond the scope of the article. $(\varepsilon_{d,n})_{d,n}$ is a triangular array of random
matrices $\varepsilon_{d,n} \in \{0,1\}^{d \times n}$ independent of $(X(i,k))_{i,k \in \mathbb{N}}$, where the entries
$\varepsilon_{ik,d,n}$ are independent Bernoulli variables with observation probabilities

$$P(\varepsilon_{ik,d,n} = 1) = p_{i,d,n}, \quad i = 1, \dots, d, \ k = 1, \dots, n.$$


The dependence of the set $\mathcal{N}_{ij}$ and the number $N_{ij}$ in (1.1) on the sequence $(\varepsilon_{d,n})$ is
indicated by an additional subscript $d,n$. Throughout this article we impose that the
family of spectral measures of the population covariance matrices $(T_{d,n})$ as well as the
family of empirical distributions

$$\frac{1}{d} \sum_{i=1}^d \delta_{p_{i,d,n}^{-1}}$$

are tight. This assumption ensures that there are not too many probabilities of
observation $p_{i,d,n}$ in the vector $p_{d,n}$ that are very close to zero, in the sense that for most
coordinates $i = 1, \dots, d$ the number of observations remains proportional to $n$, while a
few degeneracies may occur. Asymptotic statements refer to

$$d \to \infty \ \text{ while } n = n(d) \text{ satisfies } \limsup_{d \to \infty} (d/n) < \infty. \tag{2.2}$$


The sequence of sample covariance matrices with missing observations is denoted by 



the corresponding sequence of spectral measures by $(\mu_{d,n})_{d,n}$ and their Stieltjes
transforms by $(m_{d,n})_{d,n}$.

3. Results 

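The main results of the article are the weak approximation of the spectral measure $\mu_{d,n}$
of the sample covariance matrix with missing observations by a non-random sequence of
probability measures, and, in the null case, the almost sure convergence of its extremal
eigenvalues. Thereto, define the matrices

$$R_{d,n} = \operatorname{diag}\Big( \frac{T_{11,d,n}}{p_{1,d,n}}, \dots, \frac{T_{dd,d,n}}{p_{d,d,n}} \Big)
\quad \text{and} \quad
S_{d,n} = \operatorname{diag}\Big( \frac{1 - p_{1,d,n}}{p_{1,d,n}} T_{11,d,n}, \dots, \frac{1 - p_{d,d,n}}{p_{d,d,n}} T_{dd,d,n} \Big).$$
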
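Theorem 3.1. Suppose that the assumptions stated in Subsection 2.2 hold, and

$$\sup_d \|T_{d,n}\|_{S_\infty} < \infty.$$

Then for any $z \in \mathbb{C}^+$,

$$|m_{d,n}(z) - m^\circ_{d,n}(z)| \to 0 \quad a.s.,$$

where $m^\circ_{d,n}(z)$ satisfies

$$m^\circ_{d,n}(z) = \frac{1}{d} \operatorname{tr}\Big( \Big( \frac{1}{1 + \frac{d}{n} e^\circ_{d,n}(z)} R_{d,n} - S_{d,n} - z I_{d \times d} \Big)^{-1} \Big)$$

and $e^\circ_{d,n}$ is the (unique) solution of the fixed point equation

$$e^\circ_{d,n}(z) = \frac{1}{d} \operatorname{tr}\Big( R_{d,n} \Big( \frac{1}{1 + \frac{d}{n} e^\circ_{d,n}(z)} R_{d,n} - S_{d,n} - z I_{d \times d} \Big)^{-1} \Big). \tag{3.1}$$

Moreover, $m^\circ_{d,n}$ is the Stieltjes transform of a probability measure $\mu^\circ_{d,n}$ on the real line
and

$$\mu_{d,n} - \mu^\circ_{d,n} \Rightarrow 0 \quad a.s.$$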

Remark. Note that the theorem covers in particular the case $d/n \to 0$. It follows from
the proof that


Kuiz)] < 


3(z) 


z G C+. 


Due to $R_{d,n} - S_{d,n} = T_{d,n}$, this implies that the Stieltjes transforms $m^\circ_{d,n}$ approach those of
the spectral measures of $T_{d,n}$ as $d/n \to 0$. That is, an effect caused by missing observations
appears asymptotically only in the high-dimensional scenario $\liminf_d d/n > 0$.


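The equation (3.1) characterizes uniquely the approximating spectral measure via its
Stieltjes transform. Without missing observation, i.e. $p_{i,d,n} = 1$, the solution of (3.1)
coincides with the solution to the Marcenko-Pastur equation

$$m^\circ_{d,n}(z) = \frac{1}{d} \sum_{i=1}^d \frac{1}{T_{ii,d,n}\big(1 - \frac{d}{n} - \frac{d}{n} z \, m^\circ_{d,n}(z)\big) - z}.$$

The difference in the representation results from the fact that the spectra of

$$\frac{1}{n} T_{d,n}^{1/2} X_{d,n} X_{d,n}^* T_{d,n}^{1/2} \quad \text{and} \quad \frac{1}{n} X_{d,n}^* T_{d,n} X_{d,n}$$

are identical up to $|d - n|$ zero eigenvalues, which is used in the classical analysis. Except
for special cases, this simplification is not possible in the missing at random scenario.
It is well-known that the Stieltjes transform of the Marcenko-Pastur law with parameters
$(y, \sigma^2/p_0)$ is the unique solution to

$$s(z) = \Big( \frac{\sigma^2}{p_0} \, \frac{1}{1 + \frac{\sigma^2}{p_0} y \, s(z)} - z \Big)^{-1}$$

from $\mathbb{C}^+ \to \mathbb{C}^+$. In the special case $T_{d,n} = \sigma^2 I_{d \times d}$ and $p_{d,n} = (p_0, \dots, p_0) \in (0,1)^d$, we
have

$$m^\circ_{d,n}(z) = \Big( \frac{\sigma^2}{p_0} \, \frac{1}{1 + \frac{\sigma^2}{p_0} \frac{d}{n} \, m^\circ_{d,n}(z)} - \frac{1 - p_0}{p_0} \sigma^2 - z \Big)^{-1}.$$

Hence, $\mu^\circ_{d,n}$ is the Marcenko-Pastur law $\mu^{MP}_{d/n, \sigma^2/p_0}$ shifted by $\sigma^2 (1 - p_0)/p_0$ to the left.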
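In the null case with equal observation probabilities the fixed point equation becomes a quadratic equation once fractions are cleared, so it can be solved in closed form. The sketch below is a hedged illustration using only the standard library: the function names are hypothetical, and selecting the root with the larger imaginary part is an assumption that is adequate for the examples shown, not a statement taken from the article.

```python
import cmath

def mp_stieltjes(z, y, v):
    """Solve s = 1 / (v / (1 + v*y*s) - z) for s.

    Clearing fractions gives z*v*y*s**2 + (z - v*(1 - y))*s + 1 = 0.
    Assumption: the relevant root is the one with the larger imaginary part.
    """
    a = z * v * y
    b = z - v * (1.0 - y)
    root = cmath.sqrt(b * b - 4.0 * a)
    candidates = ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))
    return max(candidates, key=lambda s: s.imag)

def shifted_null_case_transform(z, y, sigma2, p0):
    """Stieltjes transform of the Marcenko-Pastur law with variance sigma2/p0,
    shifted by sigma2*(1 - p0)/p0 to the left."""
    shift = sigma2 * (1.0 - p0) / p0
    return mp_stieltjes(z + shift, y, sigma2 / p0)
```

Shifting a measure by $c$ to the left replaces $z$ by $z + c$ in its Stieltjes transform, which is why the second function simply evaluates the first at a translated argument.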




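Corollary 3.2. Grant the conditions of Theorem 3.1. If $p_{i,d,n} = p_0 > 0$ for $i = 1, \dots, d$
and $d, n \in \mathbb{N}$ and $T_{d,n} = \sigma^2 I_{d \times d}$, $\sigma^2 > 0$, we obtain

$$\mu_{d,n} \Rightarrow \mu^{MP}_{y, \sigma^2/p_0} \star \delta_{-\frac{1 - p_0}{p_0} \sigma^2} \quad a.s.$$

as $d \to \infty$ and $d/n \to y > 0$. Eventually, as $y < 1$,

$$\limsup_{d \to \infty} \lambda_{\min} < 0 \quad a.s. \quad \text{if } p_0 < 1 - (1 - \sqrt{y})^2.$$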


In other words, under the missing at random scenario where each component is observed
with the same probability $p_0$, the limiting spectral distribution is a Marcenko-Pastur
law shifted by $\sigma^2 (1 - p_0)/p_0$ to the left. Eventually, the sample covariance matrix is not
positive definite if $p_0$ is smaller than

$$1 - (1 - \sqrt{y})^2.$$

For the estimator $\hat\Sigma_{d,n}$ we even determine the almost sure limit of the extremal
eigenvalues.


Theorem 3.3. Grant the conditions of Corollary 3.2, let additionally $\mathbb{E}X_{11}^4 < \infty$ and
$\varepsilon_{d,n} \in \{0,1\}^{d \times n}$ be the upper left corner of a double array $(\varepsilon(i,k))_{i,k \in \mathbb{N}}$ of iid Bernoulli
variables with parameter $p_0$. Assume that $\mathbb{E}Y_{d,n} = 0$. Then, if $0 < y < 1$,

$$\lim_{d \to \infty} \lambda_{\min}(\hat\Sigma_{d,n}) = \frac{\sigma^2 (1 - \sqrt{y})^2}{p_0} - \frac{1 - p_0}{p_0} \sigma^2 \quad a.s.,$$

and

$$\lim_{d \to \infty} \lambda_{\max}(\hat\Sigma_{d,n}) = \frac{\sigma^2 (1 + \sqrt{y})^2}{p_0} - \frac{1 - p_0}{p_0} \sigma^2 \quad a.s.$$

The limit of the smallest eigenvalue is always smaller than in the completely observed case
$p_0 = 1$, whereas the largest eigenvalue is always larger. In the limiting case $y \to 0$ both
expressions on the right-hand side reduce to $\sigma^2$ as in the completely observed classical
case, independently of $p_0$.

As in Theorem 1 of Bai and Yin (1993) the existence of the fourth moment is necessary 
for the above Theorem to hold. The proof of the necessity is a straightforward adaption 
of the arguments in Yin, Bai and Krishnaiah (1988). 

The characterization of positive definiteness in the null case under the missing at random 
scenario is an immediate corollary of Theorem 3.3. 


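Corollary 3.4. Under the condition of Theorem 3.3,

$$\lim_{d \to \infty} \lambda_{\min}(\hat\Sigma_{d,n}) < 0 \quad a.s. \quad \text{if } p_0 < 1 - (1 - \sqrt{y})^2, \text{ and}$$

$$\lim_{d \to \infty} \lambda_{\min}(\hat\Sigma_{d,n}) > 0 \quad a.s. \quad \text{if } p_0 > 1 - (1 - \sqrt{y})^2.$$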
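The limits in Theorem 3.3 and the threshold in Corollary 3.4 are explicit, so they can be tabulated directly. The following sketch uses hypothetical function names and only the standard library; it evaluates the two boundary points of the shifted Marcenko-Pastur support and the critical observation probability.

```python
import math

def shifted_mp_edges(y, p0, sigma2=1.0):
    """Almost sure limits of the smallest and largest eigenvalue for 0 < y < 1."""
    shift = sigma2 * (1.0 - p0) / p0
    lower = sigma2 * (1.0 - math.sqrt(y)) ** 2 / p0 - shift
    upper = sigma2 * (1.0 + math.sqrt(y)) ** 2 / p0 - shift
    return lower, upper

def positive_definiteness_threshold(y):
    """Observation probability at which the lower boundary point equals zero."""
    return 1.0 - (1.0 - math.sqrt(y)) ** 2

# Example: y = 1/4 gives the threshold 3/4; with p0 = 1/2 the lower edge is negative.
lower, upper = shifted_mp_edges(0.25, 0.5)
```

For $y = 1/4$ and $p_0 = 1/2$ the two limits are $-1/2$ and $7/2$, compared with $1/4$ and $9/4$ in the completely observed case.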


4. Proof of Theorem 3.1, Part I 


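Reduction to the form $\frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2} - S_{d,n}$

With the notation

$$\tilde T_{d,n} = \frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2} - S_{d,n}$$

and

$$Z_{d,n} \in \mathbb{R}^{d \times n}, \quad Z_{ik,d,n} = \frac{X_{ik,d,n} \, \varepsilon_{ik,d,n}}{p_{i,d,n}^{1/2}}, \quad i = 1, \dots, d, \ k = 1, \dots, n,$$

let $\tilde\mu_{d,n}$ be the spectral measure of $\tilde T_{d,n}$. The aim of this section is to show that the
spectral distributions $\mu_{d,n}$ of the sample covariance matrices with missing observations
may be approximated by $\tilde\mu_{d,n}$.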

Proposition 4.1. Grant the conditions of Subsection 2.2. Then

$$d_L(\mu_{d,n}, \tilde\mu_{d,n}) \to 0 \quad a.s.$$

Remark. Corollary 3.2 can be equally deduced from Proposition 4.1. Since in that case
$S_{d,n}$ is a multiple of identity, the eigenvalues satisfy

$$\lambda_i(\tilde T_{d,n}) = \lambda_i\Big( \frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2} \Big) - \frac{1 - p_0}{p_0} \sigma^2, \quad i = 1, \dots, d.$$

For the matrix

$$\frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2}$$

it is well-known (see e.g. Silverstein (1995)) that the spectral distribution converges weakly
to $\mu^{MP}_{y, \sigma^2/p_0}$ almost surely as $d/n \to y > 0$.
The proof of Proposition 4.1 is postponed to Appendix A. At this place we give a sketch
of the proof. Subsequently we restrict our attention to the estimator $\hat T_{d,n}$. The proof for
$\hat\Sigma_{d,n}$ is just a simplified version.

The proof of Proposition 4.1 is subdivided into eight steps. In each step $\hat T_{d,n}$ is modified in
a way which does not affect its spectral distribution asymptotically. In order to simplify
the notation each modification of $\hat T_{d,n}$ from one step will be again denoted by $\hat T_{d,n}$ in the
next step. Within the proof denote

$$W_{d,n} \in \mathbb{R}^{d \times d}, \quad W_{ij,d,n} = \frac{n}{N_{ij,d,n}},
\qquad
\bar W_{d,n} \in \mathbb{R}^{d \times d}, \quad \bar W_{ij,d,n} = \frac{n}{\mathbb{E} \, \# \mathcal{N}_{ij,d,n}}.$$

Before we start with the description of the proof we rearrange the entries $\hat T_{ij,d,n}$ as follows:

$$\hat T_{ij,d,n} = \frac{1}{N_{ij,d,n}} \sum_{k \in \mathcal{N}_{ij,d,n}}
\Big( Y_{ik,d,n} - \mathbb{E}Y_{ik,d,n} - \frac{1}{N_{ii,d,n}} \sum_{l \in \mathcal{N}_{ii,d,n}} (Y_{il,d,n} - \mathbb{E}Y_{il,d,n}) \Big)
\Big( Y_{jk,d,n} - \mathbb{E}Y_{jk,d,n} - \frac{1}{N_{jj,d,n}} \sum_{l \in \mathcal{N}_{jj,d,n}} (Y_{jl,d,n} - \mathbb{E}Y_{jl,d,n}) \Big).$$

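Therefore, we may assume without loss of generality $Y_{d,n}$ to be centered. Rewrite $\hat T_{d,n}$
in the following way

$$\hat T_{d,n} = \frac{1}{n} W_{d,n} \circ \big( (Y_{d,n} \circ \varepsilon_{d,n})(Y_{d,n} \circ \varepsilon_{d,n})^* \big)
- \frac{1}{n} W_{d,n} \circ \big( (M_{d,n} \circ \varepsilon_{d,n})(Y_{d,n} \circ \varepsilon_{d,n})^* \big)$$
$$\qquad - \frac{1}{n} W_{d,n} \circ \big( (Y_{d,n} \circ \varepsilon_{d,n})(M_{d,n} \circ \varepsilon_{d,n})^* \big)
+ \frac{1}{n} W_{d,n} \circ \big( (M_{d,n} \circ \varepsilon_{d,n})(M_{d,n} \circ \varepsilon_{d,n})^* \big),$$

where

$$M_{d,n} = (m_{d,n}, \dots, m_{d,n}) \in \mathbb{R}^{d \times n}
\quad \text{with} \quad
m_{i,d,n} = \frac{1}{N_{ii,d,n}} \sum_{k \in \mathcal{N}_{ii,d,n}} Y_{ik,d,n}. \tag{4.1}$$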
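The rewriting of the estimator in terms of Hadamard products can be checked numerically against the entrywise definition. The sketch below assumes NumPy, uses hypothetical function names, and requires that every pair of coordinates is jointly observed at least once; it is an illustration of the algebraic identity rather than an efficient implementation.

```python
import numpy as np

def estimator_entrywise(Y, eps):
    """(1/N_ij) * sum over jointly observed k of (Y_ik - m_i)(Y_jk - m_j)."""
    d, _ = Y.shape
    means = [Y[i, eps[i] == 1].mean() for i in range(d)]   # m_i over observed entries
    out = np.empty((d, d))
    for i in range(d):
        for j in range(d):
            both = (eps[i] == 1) & (eps[j] == 1)
            out[i, j] = np.mean((Y[i, both] - means[i]) * (Y[j, both] - means[j]))
    return out

def estimator_hadamard(Y, eps):
    """Four-term Hadamard representation with W_ij = n / N_ij and M = (m, ..., m)."""
    _, n = Y.shape
    N = eps @ eps.T
    W = n / N
    m = (Y * eps).sum(axis=1) / np.diag(N)
    M = np.repeat(m[:, None], n, axis=1)
    Ye, Me = Y * eps, M * eps
    return W * (Ye @ Ye.T - Me @ Ye.T - Ye @ Me.T + Me @ Me.T) / n
```

Both functions return the same matrix up to rounding, which makes the second form a convenient starting point for the step-by-step modifications described in the sketch of the proof.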
Let us briefly describe the separate steps of the proof. The first three steps use the
inequality

$$d_K(\mu^A, \mu^B) \le \frac{1}{d} \operatorname{rank}(A - B)$$

for Hermitian matrices $A, B \in \mathbb{C}^{d \times d}$ in order to regularize certain rows of $\varepsilon_{d,n}$ for which
the probability of observation $p_{i,d,n}$ is smaller than some given value $p_0 > 0$, to get rid
of the additive term

$$\frac{1}{n} W_{d,n} \circ \big( (M_{d,n} \circ \varepsilon_{d,n})(M_{d,n} \circ \varepsilon_{d,n})^* \big),$$

and to truncate the diagonal entries of $T_{d,n}$. Thereafter we want to make use of the
inequality

$$d_L^3(\mu^A, \mu^B) \le \frac{1}{d} \operatorname{tr}\big( (A - B)(A - B)^* \big), \tag{4.2}$$
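The rank inequality for the Kolmogorov metric of two spectral measures is easy to observe numerically. The following sketch assumes NumPy and uses hypothetical function names; it compares the two empirical spectral distribution functions on the union of the eigenvalues and evaluates the rank bound.

```python
import numpy as np

def kolmogorov_distance_of_spectra(A, B):
    """sup_x |F^A(x) - F^B(x)| for the spectral measures of Hermitian A and B."""
    ea = np.linalg.eigvalsh(A)           # eigenvalues in ascending order
    eb = np.linalg.eigvalsh(B)
    grid = np.concatenate([ea, eb])      # the supremum is attained at an eigenvalue
    Fa = np.searchsorted(ea, grid, side="right") / ea.size
    Fb = np.searchsorted(eb, grid, side="right") / eb.size
    return float(np.max(np.abs(Fa - Fb)))

def rank_bound(A, B):
    """Right-hand side rank(A - B) / d of the rank inequality."""
    return np.linalg.matrix_rank(A - B) / A.shape[0]
```

A rank-one perturbation of a symmetric matrix, for instance, moves the spectral distribution function by at most $1/d$ in the Kolmogorov metric, however large the perturbation is in norm.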

where, in our case, $A$ and $B$ are two $d \times d$ random Hermitian matrices. In order to
deduce almost sure convergence to 0 of the right-hand side by means of the Borel-Cantelli
lemma, truncation of the random variables $X_{ik,d,n}$ is necessary to guarantee the existence
of higher order moments of the empirical spectral distribution of $\hat T_{d,n}$. This is realized in
Step IV. In Step V the matrix $W_{d,n}$ is replaced by its deterministic counterpart $\bar W_{d,n}$, the
evaluation of which is based on a sophisticated combinatorial analysis of moments. In
Step VI a combination of both inequalities displayed above is applied. More precisely, an
entry $Y_{ik,d,n}$ is preserved depending on whether its absolute row sum exceeds
a certain value or not. The number of removed rows is asymptotically negligible while
the remaining matrix is suitable for an application of (4.2). Hereby, the matrices

$$\frac{1}{n} \bar W_{d,n} \circ \big( (M_{d,n} \circ \varepsilon_{d,n})(Y_{d,n} \circ \varepsilon_{d,n})^* \big)
\quad \text{and} \quad
\frac{1}{n} \bar W_{d,n} \circ \big( (Y_{d,n} \circ \varepsilon_{d,n})(M_{d,n} \circ \varepsilon_{d,n})^* \big)$$

are removed from $\hat T_{d,n}$. The form

$$\bar W_{d,n} = w_{d,n} w_{d,n}^* + \operatorname{diag}\big( \bar W_{d,n} - w_{d,n} w_{d,n}^* \big)$$

is the motivation for replacing

$$\frac{1}{n} \operatorname{diag}\big( \bar W_{d,n} - w_{d,n} w_{d,n}^* \big) \circ \big( (Y_{d,n} \circ \varepsilon_{d,n})(Y_{d,n} \circ \varepsilon_{d,n})^* \big)$$

by its expectation in Step VII. Reverting finally the truncation Steps II, III, IV yields
the claim.

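In the next section $\Sigma_{d,n}$ denotes the matrix

$$\frac{1}{n} (w_{d,n} w_{d,n}^*) \circ \big( (Y_{d,n} \circ \varepsilon_{d,n})(Y_{d,n} \circ \varepsilon_{d,n})^* \big) - S_{d,n}
= \frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2} - S_{d,n},$$

which is obtained in Step VIII. Correspondingly, we write $\mu_{d,n}$ and $m_{d,n}$ for its spectral
measure and the Stieltjes transform.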
Remark. In the case of non-diagonal $T_{d,n}$ we cannot reduce the sample covariance
matrix with missing observations to the form

$$\frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2} - S_{d,n}$$

but instead have to analyze the spectrum of

$$\frac{1}{n} (\tilde Y_{d,n} \circ \varepsilon_{d,n})(\tilde Y_{d,n} \circ \varepsilon_{d,n})^* - S_{d,n}$$

with

$$\tilde Y_{d,n} = \operatorname{diag}(w_{d,n}) Y_{d,n}.$$

Nevertheless, the arguments of Section 5 can be modified at the cost of additional technical
expenditure. We find that the ideas of the proof are much clearer for the diagonal special
case and therefore omitted this extension due to the length of the paper.


5. Proof of Theorem 3.1, Part II 


Note that, in general, the spectral analysis and limiting behavior of $\Sigma_{d,n}$ significantly
differ from those of the matrix analyzed in Bai and Silverstein (1995). By Proposition
4.1 as well as Lemma C.12 and Lemma C.13, we continue to show that

$$|m_{d,n}(z) - m^\circ_{d,n}(z)| \to 0 \quad a.s.$$

for all $z \in \mathbb{C}^+$. Such type of convergence has been established in Couillet, Debbah and
Silverstein (2011) for

$$\frac{1}{n} B_{d,n}^{1/2} X_{d,n} X_{d,n}^* B_{d,n}^{1/2} + A_{d,n}$$

for positive semidefinite Hermitian matrices $A_{d,n}, B_{d,n} \in \mathbb{C}^{d \times d}$. For the proof of Theorem
3.1 we establish the weak approximation in case of the negative semidefinite matrix
$A_{d,n} = -S_{d,n}$. This requires several changes in the arguments of Couillet, Debbah and
Silverstein (2011) due to the fact that the function

$$-\frac{1}{z(1 + m(z))}$$

is a Stieltjes transform if $m$ is a Stieltjes transform of a finite measure on $[0, \infty)$ but in
general, this is not true any longer if $m$ is just a Stieltjes transform of a finite measure
on $\mathbb{R}$. Moreover, our proof includes also the case $d/n \to 0$.

The proof is structured as follows. In the first step we truncate the entries of $X_{d,n}$ at
the threshold level $K > 0$ which goes to infinity at the very end. Afterwards we start to
analyze the Stieltjes transform of the empirical spectral distribution of $\Sigma_{d,n}$. With the
resolvent

$$G_{d,n}(z) = (\Sigma_{d,n} - z I_{d \times d})^{-1}$$

we prove that

$$e_{d,n}(z) = \frac{1}{d} \operatorname{tr}\big( R_{d,n} G_{d,n}(z) \big)$$

is an approximate solution to the fixed point equation in Theorem 3.1 in Step II.
Correspondingly, the Stieltjes transform $m_{d,n}$ is shown to be approximated by the expression
(3.1) with $e_{d,n}$ in place of $e^\circ_{d,n}$. In the third step existence and uniqueness of a solution
to the system of equations for $e^\circ_{d,n}$ is established. The solution $m^\circ_{d,n}$ is identified as a
Stieltjes transform in Step IV. In Step V and VI, pointwise almost sure convergence of
$e_{d,n} - e^\circ_{d,n}$ and $m_{d,n} - m^\circ_{d,n}$ to zero is derived. Finally, we deduce the weak convergence
$\mu_{d,n} - \mu^\circ_{d,n} \Rightarrow 0$ almost surely in Step VII.


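5.1. Step I: Second truncation of $X_{d,n}$

For arbitrary $K > 0$ define matrices $\bar X_{d,n}$, $\bar Z_{d,n}$ and
$\bar\Sigma_{d,n} = \frac{1}{n} R_{d,n}^{1/2} \bar Z_{d,n} \bar Z_{d,n}^* R_{d,n}^{1/2} - S_{d,n}$, where

$$\bar X_{ik} = X_{ik} 1\{|X_{ik}| < K\} \quad \text{and} \quad \bar Z_{ik,d,n} = \frac{\bar X_{ik} \, \varepsilon_{ik,d,n}}{p_{i,d,n}^{1/2}}.$$

Moreover, define for arbitrary $\delta > 0$ the event

$$A_{i,d,n} = \Big\{ \Big| \frac{1}{n} \sum_{l=1}^n X_{il}^2 - \mathbb{E}X_{11}^2 \Big| \vee \Big| \frac{1}{n} \sum_{l=1}^n X_{il}^2 1\{|X_{il}| > K\} - \mathbb{E}X_{11}^2 1\{|X_{11}| > K\} \Big| < \delta \Big\}.$$

With this notation, let

$$\Sigma'_{d,n} = \frac{1}{n} R_{d,n}^{1/2} Z'_{d,n} (Z'_{d,n})^* R_{d,n}^{1/2} - S_{d,n}
\quad \text{and} \quad
\bar\Sigma'_{d,n} = \frac{1}{n} R_{d,n}^{1/2} \bar Z'_{d,n} (\bar Z'_{d,n})^* R_{d,n}^{1/2} - S_{d,n},$$

where

$$X'_{ik,d,n} = X_{ik} 1_{A_{i,d,n}}, \qquad \bar X'_{ik,d,n} = \bar X_{ik} 1_{A_{i,d,n}}$$

and

$$Z'_{ik,d,n} = \frac{X'_{ik,d,n} \, \varepsilon_{ik,d,n}}{p_{i,d,n}^{1/2}}, \qquad \bar Z'_{ik,d,n} = \frac{\bar X'_{ik,d,n} \, \varepsilon_{ik,d,n}}{p_{i,d,n}^{1/2}}.$$

Then,

$$d_L\big( \mu^{\Sigma_{d,n}}, \mu^{\bar\Sigma_{d,n}} \big) \le d_L\big( \mu^{\Sigma_{d,n}}, \mu^{\Sigma'_{d,n}} \big) + d_L\big( \mu^{\Sigma'_{d,n}}, \mu^{\bar\Sigma'_{d,n}} \big) + d_L\big( \mu^{\bar\Sigma'_{d,n}}, \mu^{\bar\Sigma_{d,n}} \big). \tag{5.1}$$

First, we evaluate the second term $d_L(\mu^{\Sigma'_{d,n}}, \mu^{\bar\Sigma'_{d,n}})$ in (5.1). By Theorem C.10 for $\alpha = 1$,
the Lidskii-Wielandt perturbation bound (1.2) in Li and Mathias (1999), and Holder's
inequality for Schatten norms,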
i=1


< 


< 


1 

dn 

1 

dn 

1 

dn 


Rd^,nZkniZ',^nrK',: ” 

ZdAZknT - Z'd,niZd,nr ^ 

Ol 

+ z'd,M,r. - Z'd,ny 

X ll^d.nllg^ 


1/2 pi/2'7/ / 7 / 


,1/2 


Si 


< - 

dn 

1 

< — 
dn 


{zy^ - zyjizy^ - zy^r II + 2 II (Z'_„ - J* 


I Si 


\\Rd, 


"II So 


< 


dn 


{zy^ - zyjizy^ - Z',„) 
x|| 

tr {{Zy^ - Zyj{Zy„ - Zyj*) 


/ A* 

Si 

X I|-Rd,n||g_ 


+ 2 


7 ' 7 ' 


S2 


Z'dd 


S2 




As in Subsection A.1 let $p_0 > 0$ be the lower bound on $p_{i,d,n}$, $i = 1, \dots, d$ and $d \in \mathbb{N}$.
With this notation, we show that


1 


sup — tr - ZyjiZy^ - Zyj*) < 


EXf^l{\Xii\ >K} + S 


Po 


while 


We have 




1 

dn 


d 


T ^ '.^^. {Z-if^ d.n Z^k^dju') 


dn 

i=l k—1 

1 / 1 

< ^ max 


< 


dn i — l,...,d \Pi^d^7i 

EXfilllATiil >K} + d 
Po 


>df} 


fc=l 
Moreover,


d n 


i—l k—1 


< -— max 


dn i=i,...,d \Pi^d, 


E X!k,d,uH\X^k,d,u\ < K} 


i=l 


k=l 


< 


EX. 


11 


Po 


Concerning the first summand in (5.1), it holds $P(A_{1,d,n}) \to 1$ as $d \to \infty$ by the weak law of
large numbers. Note that $P(A_{1,d,n}) = P(A_{2,d,n}) = \dots = P(A_{d,d,n})$. Then by Hoeffding's
inequality for sufficiently large $d$,

P (E > m) < P (u;,, - P(AJ,,„)) > lsd\ 




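Hence, by the Borel-Cantelli lemma

$$\limsup_{d \to \infty} \frac{1}{d} \sum_{i=1}^d 1_{A^c_{i,d,n}} \le \delta$$

almost surely. As in inequality (A.2) of Subsection A.4 we deduce

$$\limsup_{d \to \infty} d_L\big( \mu^{\Sigma_{d,n}}, \mu^{\Sigma'_{d,n}} \big)
\le \limsup_{d \to \infty} d_K\big( \mu^{\Sigma_{d,n}}, \mu^{\Sigma'_{d,n}} \big)
\le \limsup_{d \to \infty} \frac{1}{d} \operatorname{rank}\big( \Sigma_{d,n} - \Sigma'_{d,n} \big)
\le 2\delta$$

almost surely. The third summand in (5.1) is bounded in the same way. Putting things
together in the right-hand side of (5.1),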


limsupdz, I j 

d—¥00 ^ ' 

< 4(5 + sup \\Rd,n\\l^^ 


EX^^l{\Xn \ >K} + S 


Po 


+ 2 


Y^EX^]^ 1{ |Xii I > X} + S\/S EX 


Po 


1/2 
almost surely. Since $\delta$ may be chosen arbitrarily small, we conclude
limsup di () 

d—¥oo ^ ' 


< sup \\Rd,n\\l^^ 

d °° 


Po ^ Po 


In turn, the last expression can be made arbitrarily small for $K$ sufficiently large. Since the
centralization of the truncated random variables $\bar X_{ik}$ leads to a finite rank perturbation
(uniformly in $d$), we may assume the entries $\bar X_{ik}$ to be centered. In the following
denote the centered truncated random matrix again by $X_{d,n}$. Then, analogously to the
truncation step, with the indicator $1\{|X_{ik}| < K\}$ in the definition of $\bar X$ replaced by a
suitably rescaled version, we may assume the entries to be standardized since the variance
of the truncated variables converges to one as the truncation level tends to infinity.
Therefore, in the rest of the proof we analyze the matrix

$$\Sigma_{d,n} = \frac{1}{n} R_{d,n}^{1/2} Z_{d,n} Z_{d,n}^* R_{d,n}^{1/2} - S_{d,n},$$

where the entries of the matrix $Z_{d,n}$ are centered, standardized and bounded.
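The truncation, centering and standardization described in this step can be made concrete for a discrete distribution, where all moments are finite sums. The sketch below uses only the standard library; the function name and the example distribution are illustrative assumptions and do not appear in the article.

```python
import math

def truncate_center_standardize(values, probs, K):
    """For the law sum_j probs[j] * delta_{values[j]}: set values with |x| >= K to zero,
    subtract the mean of the truncated variable and divide by its standard deviation.

    Returns the transformed values and the variance of the truncated variable.
    """
    truncated = [x if abs(x) < K else 0.0 for x in values]
    mean = sum(p * x for p, x in zip(probs, truncated))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, truncated))
    return [(x - mean) / math.sqrt(var) for x in truncated], var

# A centered law with unit variance: 0.2 * 4 + 0.8 * 0.25 = 1.
values = [-2.0, -0.5, 0.5, 2.0]
probs = [0.1, 0.4, 0.4, 0.1]
standardized, truncated_var = truncate_center_standardize(values, probs, K=1.0)
```

For a truncation level above the largest atom the variance of the truncated variable equals one again, in line with the convergence of the truncated variance as the truncation level tends to infinity.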


5.2. Step II: Approximate solution to the fixed point equation (3.1)

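Subsequently, we assume that

$$\liminf_{d \to \infty} \frac{d}{n} > 0. \tag{5.2}$$

The general case is treated in Step VI. Recall that $\mu_{d,n}$ denotes the (normalized) spectral
measure of $\Sigma_{d,n}$ and denote its Stieltjes transform by

$$m_{d,n}(z) = \int \frac{1}{\lambda - z} \, d\mu_{d,n}(\lambda), \quad z \in \mathbb{C}^+. \tag{5.3}$$

We use subsequently the following abbreviations for the resolvents

$$G_{d,n}(z) = (\Sigma_{d,n} - z I_{d \times d})^{-1} \quad \text{and} \quad G^{(k)}_{d,n}(z) = (\Sigma^{(k)}_{d,n} - z I_{d \times d})^{-1}, \quad k = 1, \dots, n.$$

For $z \in \mathbb{C}^+$, define

$$e_{d,n}(z) = \frac{1}{d} \operatorname{tr}\big( R_{d,n} G_{d,n}(z) \big).$$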

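Our goal in this step is to show that

$$\frac{1}{d} \operatorname{tr}\big( D_{d,n}^{-1}(z) \big) - m_{d,n}(z) \to 0 \quad a.s., \text{ and} \tag{5.4}$$

$$\frac{1}{d} \operatorname{tr}\big( R_{d,n} D_{d,n}^{-1}(z) \big) - e_{d,n}(z) \to 0 \quad a.s. \tag{5.5}$$

with

$$D_{d,n}(z) = \frac{1}{1 + \frac{d}{n} e_{d,n}(z)} R_{d,n} - S_{d,n} - z I_{d \times d}. \tag{5.6}$$

Let $\Sigma_{d,n} = O_{d,n} \Lambda_{d,n} O_{d,n}^*$ denote a spectral decomposition, where

$$\Lambda_{d,n} = \operatorname{diag}(\lambda_{1,d,n}, \dots, \lambda_{d,d,n}),$$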

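and define $R^O_{d,n} = O_{d,n}^* R_{d,n} O_{d,n}$. With this notation,

$$e_{d,n}(z) = \frac{1}{d} \operatorname{tr}\big( R_{d,n} G_{d,n}(z) \big)
= \frac{1}{d} \operatorname{tr}\big( R_{d,n} (O_{d,n} \Lambda_{d,n} O_{d,n}^* - z I_{d \times d})^{-1} \big)$$
$$= \frac{1}{d} \operatorname{tr}\big( R_{d,n} O_{d,n} (\Lambda_{d,n} - z I_{d \times d})^{-1} O_{d,n}^* \big)
= \frac{1}{d} \operatorname{tr}\big( O_{d,n}^* R_{d,n} O_{d,n} (\Lambda_{d,n} - z I_{d \times d})^{-1} \big)$$
$$= \frac{1}{d} \operatorname{tr}\big( R^O_{d,n} (\Lambda_{d,n} - z I_{d \times d})^{-1} \big)
= \frac{1}{d} \sum_{i=1}^d \frac{R^O_{ii,d,n}}{\lambda_{i,d,n} - z}. \tag{5.7}$$

Since $R_{d,n}$ and therefore $R^O_{d,n}$ are positive semidefinite, the diagonal entries $R^O_{ii,d,n}$,
$i = 1, \dots, d$, are non-negative. Hence, $e_{d,n}$ is the Stieltjes transform of a measure on $\mathbb{R}$ with
at most $d$ support points and total mass

$$\frac{1}{d} \operatorname{tr} R_{d,n}.$$

Note that $\Sigma_{d,n}$ is not necessarily positive semidefinite, hence the support points are not
restricted to $[0, \infty)$. As a Stieltjes transform,

$$e_{d,n} : \mathbb{C}^+ \to \mathbb{C}^+. \tag{5.8}$$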


This implies in particular that $D_{d,n}(z)$ as defined in (5.6) is in fact invertible by means
of Lemma C.3. Moreover, since $\|R_{d,n}\|_{S_\infty} \le \kappa$ for some constant $\kappa > 0$, it follows by
Holder's inequality and the positive definiteness of $R_{d,n}$,

$$|e_{d,n}(z)| \le \frac{1}{d} \|R_{d,n}\|_{S_1} \big\| (\Sigma_{d,n} - z I_{d \times d})^{-1} \big\|_{S_\infty}
= \Big( \frac{1}{d} \operatorname{tr} R_{d,n} \Big) \max_{1 \le i \le d} \frac{1}{|\lambda_{i,d,n} - z|}
\le \frac{\kappa}{\Im(z)}. \tag{5.9}$$


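Let $Z_{k,d,n}$ be the $k$-th column of the matrix $Z_{d,n}$ and define

$$Y_{k,d,n} = \frac{1}{\sqrt{n}} R_{d,n}^{1/2} Z_{k,d,n} \quad \text{and} \quad \Sigma^{(k)}_{d,n} = \Sigma_{d,n} - Y_{k,d,n} Y_{k,d,n}^*, \quad k = 1, \dots, n,$$

which arises from $\Sigma_{d,n}$ by taking away the $k$-th sample vector, and recall (5.6). Then,

$$\Sigma_{d,n} - z I_{d \times d} - D_{d,n}(z) = \sum_{k=1}^n Y_{k,d,n} Y_{k,d,n}^* - \frac{1}{1 + \frac{d}{n} e_{d,n}(z)} R_{d,n},$$

whence

$$D_{d,n}^{-1}(z) - G_{d,n}(z) = D_{d,n}^{-1}(z) \big( \Sigma_{d,n} - z I_{d \times d} - D_{d,n}(z) \big) G_{d,n}(z)
= D_{d,n}^{-1}(z) \Big( \sum_{k=1}^n Y_{k,d,n} Y_{k,d,n}^* - \frac{1}{1 + \frac{d}{n} e_{d,n}(z)} R_{d,n} \Big) G_{d,n}(z).$$

Therefore,

$$G_{d,n}(z) - D_{d,n}^{-1}(z) = - \sum_{k=1}^n D_{d,n}^{-1}(z) Y_{k,d,n} Y_{k,d,n}^* G_{d,n}(z) + \frac{1}{1 + \frac{d}{n} e_{d,n}(z)} D_{d,n}^{-1}(z) R_{d,n} G_{d,n}(z)$$
$$= - \sum_{k=1}^n \frac{D_{d,n}^{-1}(z) Y_{k,d,n} Y_{k,d,n}^* G^{(k)}_{d,n}(z)}{1 + Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}} + \frac{1}{1 + \frac{d}{n} e_{d,n}(z)} D_{d,n}^{-1}(z) R_{d,n} G_{d,n}(z), \tag{5.10}$$

where (5.10) follows from Lemma C.1. Altogether,

$$\frac{1}{d} \operatorname{tr}\big( D_{d,n}^{-1}(z) \big) - m_{d,n}(z) = \frac{1}{n} \sum_{k=1}^n f_{k,m} \tag{5.11}$$

with

$$f_{k,m} = \frac{1}{d} \Big( \frac{Z_{k,d,n}^* R_{d,n}^{1/2} G^{(k)}_{d,n}(z) D_{d,n}^{-1}(z) R_{d,n}^{1/2} Z_{k,d,n}}{1 + Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}} - \frac{\operatorname{tr}\big( R_{d,n} G_{d,n}(z) D_{d,n}^{-1}(z) \big)}{1 + \frac{d}{n} e_{d,n}(z)} \Big).$$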
1 + AdA^) 
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Multiplication of the matrix equality (5.10) with Rd,n from the right, we deduce 

1 " 

, tr (^Rd,nDjXiz)j - ed,n{z) = 


1 

d 




with 


fk,e 


^ 1 + y^* j gO (2)Yfc,c!,n 


tr (^Rd^nGd,n{z)Rd,nD^ l{z) 

1 + ^^d,n{z) 


Subsequently, we show that 


lim -y^fk,x = 0 

d-^-oo Tl 


a.5., X = e,m. 


(5.13) 


fc=l 
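The step leading to (5.10) replaces the full resolvent by the leave-one-out resolvent through a rank-one update identity of Sherman-Morrison type, which is presumably the content of Lemma C.1 (the lemma itself is not reproduced in this section, so this is an assumption). A small numerical sanity check on a 2 x 2 example, with all matrices, vectors and helper names chosen purely for illustration:

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def row_times_matrix(v, m):
    """Row vector v (length 2) times a 2x2 matrix m."""
    return [v[0] * m[0][0] + v[1] * m[1][0],
            v[0] * m[0][1] + v[1] * m[1][1]]


n = 3
z = complex(0.3, 0.8)
sigma_k = [[1.0, 0.2], [0.2, -0.5]]   # hypothetical leave-one-out matrix
y = [0.7, -1.1]                       # hypothetical real sample vector

# A_k = Sigma^(k) - z I  and  A = A_k + y y^T / n
a_k = [[sigma_k[i][j] - (z if i == j else 0) for j in range(2)]
       for i in range(2)]
a_full = [[a_k[i][j] + y[i] * y[j] / n for j in range(2)] for i in range(2)]

g_k = inv2(a_k)        # leave-one-out resolvent G^(k)(z)
g_full = inv2(a_full)  # full resolvent G(z)

# Identity: y^T G = y^T G^(k) / (1 + y^T G^(k) y / n)
lhs = row_times_matrix(y, g_full)
quad = sum(y[i] * g_k[i][j] * y[j] for i in range(2) for j in range(2))
rhs = [t / (1 + quad / n) for t in row_times_matrix(y, g_k)]

print(max(abs(l - r) for l, r in zip(lhs, rhs)))
```

The printed discrepancy should be at the level of floating-point rounding.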


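First observe that

\[ \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n} = \frac{1}{n} \operatorname{tr}\big( Y_{k,d,n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) \big) \]

is the Stieltjes transform of a measure on $\mathbb{R}$ with total mass $\frac{1}{n} \|Y_{k,d,n}\|_2^2$, following the arguments in (5.7). Next, with $\lambda^{(k)}_{1,d,n}, \dots, \lambda^{(k)}_{d,d,n}$ denoting the eigenvalues of $\hat\Sigma^{(k)}_{d,n}$,

\[ \big\| G^{(k)}_{d,n}(z) \big\|_{S_\infty} = \max_{i} \frac{1}{|\lambda^{(k)}_{i,d,n} - z|} \le \frac{1}{\Im(z)}. \tag{5.14} \]

The same holds true for $G_{d,n}(z)$ in place of $G^{(k)}_{d,n}(z)$. Therefore,

\[ \Big| \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n} \Big| \le \frac{\|Y_{k,d,n}\|_2^2}{n \Im(z)}, \]

which gives

\[ \bigg| \frac{1}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}} \bigg| \le \frac{1}{1 - \frac{\|Y_{k,d,n}\|_2^2}{n \Im(z)}} \quad \text{for} \quad \frac{\|Y_{k,d,n}\|_2^2}{n \Im(z)} < 1. \tag{5.15} \]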


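Denoting with $O \Lambda O^*$ the spectral decomposition of $\hat\Sigma^{(k)}_{d,n}$ and $q_i = (O^* Y_{k,d,n} Y_{k,d,n}^* O)_{ii}$ for the moment, we obtain for $\|Y_{k,d,n}\|_2 > 0$ the bound

\begin{align*}
\bigg| \frac{1}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}} \bigg|
&\le \frac{1}{\Im\big( \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n} \big)} \\
&= \frac{1}{\Im(z) \frac{1}{n} \sum_{i=1}^{d} q_i \, |\lambda^{(k)}_{i,d,n} - z|^{-2}} \\
&\le \frac{2 \max_i |\lambda^{(k)}_{i,d,n}|^2 + 2|z|^2}{\Im(z) \frac{1}{n} \sum_{i=1}^{d} q_i} \\
&= \frac{2 \max_i |\lambda^{(k)}_{i,d,n}|^2 + 2|z|^2}{\Im(z) \frac{1}{n} \|Y_{k,d,n}\|_2^2}. \tag{5.16}
\end{align*}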


Combining the first bound (5.15) in case $\|Y_{k,d,n}\|_2^2 / (n \Im(z)) \le 1/2$ with the second bound (5.16) if $\|Y_{k,d,n}\|_2^2 / (n \Im(z)) > 1/2$ yields


^ + YUuGdliHYk,d, 


< 2 


I \ (^) 

max* X]! 


^(z)2 


■ + 1 


2 maxi 


< 


v(fc) 


■4|z( 


Finally, due to 


and Lemma C.7, 


I Soo - II II So, 


9(z)2 

n 

Y.^Ud,nYU. 


Z=1 

l^k 






< C < 00 


(5.17) 


(5.18) 


almost surely for some constants C, c > 0. Define 

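\[ e^{(k)}_{d,n}(z) = \frac{1}{d} \operatorname{tr}\big( R_{d,n} G^{(k)}_{d,n}(z) \big), \qquad k \in \{1, \dots, n\}. \]

Note that analogously to (5.7), $e^{(k)}_{d,n}$ is a Stieltjes transform. Using (5.14) and the arguments of (5.15) for the case $n^{-1} \operatorname{tr}(R_{d,n}) / \Im(z) \le 1/2$ as well as (5.16) for $n^{-1} \operatorname{tr}(R_{d,n}) / \Im(z) > 1/2$ we obtain analogously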




2 maxi 


< 




■4|z| 


S(z)2 


and for some constants C, c > 0 


lim sup 

d—>-oo 


/2c + 4|z|2 
9(z)2 


-1 


1 + 


< C < 00 . 


(5.19) 


(5.20)
The same bound holds true for $e_{d,n}$ instead of $e^{(k)}_{d,n}$, in which case $\lambda^{(k)}_{i,d,n}$ are to be replaced by the eigenvalues $\lambda_{i,d,n}$ of $\hat\Sigma_{d,n}$. Therefore, with


^d.l = , max^ I (a|3_„) , A^ | , 


1 + ^erf.„(z) 1 + 


d 

n 


^ d ll^d,ri||Soo 1 
~ n d 



eii(z) 

- ed,n(z') 


(1 + ied,n{z)) 

(l + FeS(^)) 



(l + ^e,,„(.))(l + ^e«(z 


^ i- l|-Rd,n||Soo + 4|^|' 


^(2)2 


(5.21) 

(5.22) 


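where inequality (5.21) follows from Lemma C.2 and (5.22) results from (5.19). Furthermore, with

\[ D^{(k)}_{d,n}(z) = \frac{1}{1 + \frac{d}{n} e^{(k)}_{d,n}(z)} R_{d,n} - S_{d,n} - z I_{d \times d}, \]

it follows from Lemma C.3 that

\[ \big\| D_{d,n}^{-1}(z) \big\|_{S_\infty} \le \frac{1}{\Im(z)} \tag{5.23} \]

as well as

\[ \big\| \big( D^{(k)}_{d,n}(z) \big)^{-1} \big\|_{S_\infty} \le \frac{1}{\Im(z)}. \tag{5.24} \]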


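We begin with establishing (5.13). To this aim, let

\[ E_{x,d,n} = \begin{cases} I_{d \times d} & \text{for } x = m, \\ R_{d,n} & \text{for } x = e. \end{cases} \]

We decompose

\[ f_{k,x} = f^{[1]}_{k,x} + f^{[2]}_{k,x} + f^{[3]}_{k,x} + f^{[4]}_{k,x}, \]

where

\begin{align*}
f^{[1]}_{k,x} &= \frac{1}{d} \frac{Z_{k,d,n}^* R_{d,n}^{1/2} G^{(k)}_{d,n}(z) E_{x,d,n} D_{d,n}^{-1}(z) R_{d,n}^{1/2} Z_{k,d,n}}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}} - \frac{1}{d} \frac{Z_{k,d,n}^* R_{d,n}^{1/2} G^{(k)}_{d,n}(z) E_{x,d,n} \big( D^{(k)}_{d,n}(z) \big)^{-1} R_{d,n}^{1/2} Z_{k,d,n}}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}}, \\
f^{[2]}_{k,x} &= \frac{\frac{1}{d} Z_{k,d,n}^* R_{d,n}^{1/2} G^{(k)}_{d,n}(z) E_{x,d,n} \big( D^{(k)}_{d,n}(z) \big)^{-1} R_{d,n}^{1/2} Z_{k,d,n} - \frac{1}{d} \operatorname{tr}\big( R_{d,n} G^{(k)}_{d,n}(z) E_{x,d,n} \big( D^{(k)}_{d,n}(z) \big)^{-1} \big)}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}}, \\
f^{[3]}_{k,x} &= \frac{\frac{1}{d} \operatorname{tr}\big( R_{d,n} G^{(k)}_{d,n}(z) E_{x,d,n} \big( D^{(k)}_{d,n}(z) \big)^{-1} \big) - \frac{1}{d} \operatorname{tr}\big( R_{d,n} G_{d,n}(z) E_{x,d,n} D_{d,n}^{-1}(z) \big)}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}}, \\
f^{[4]}_{k,x} &= \frac{1}{d} \operatorname{tr}\big( R_{d,n} G_{d,n}(z) E_{x,d,n} D_{d,n}^{-1}(z) \big) \bigg( \frac{1}{1 + \frac{1}{n} Y_{k,d,n}^* G^{(k)}_{d,n}(z) Y_{k,d,n}} - \frac{1}{1 + \frac{d}{n} e_{d,n}(z)} \bigg).
\end{align*}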

Using Lemma C.l in (5.25) as well as the spectral norm bounds (5.22), (5.14) and (5.24) 
in (5.27), we obtain 


f[il 


1 ^:,..„<'gS(z)u,,,. 
d 




-1 


pl /2 7 - 

^d,n^k,d,n 


1 + YUnG^dl{z)Yu,d, 


77 . ^ 

^YA,nGd,nAEo.,d,n 

Dd!M-[A^l{z)y" 

^k,d,n 

^^k,d,nGdAz)Ex4,n 




(5.25) 


Df,liz) - Dd,n{z) D-^z)Y,,d, 


<^\\Yk,d,n\\l\\GdAz)\\ \\Ex,d, 


\Dl!:l{z)-Dd,r. 


(5.26) 


d,n V 
2 


< jll^fc,d,n||2 


X \\Ed,niz 

(2^S + 4|xp)"l|i?d.n|lL \\E-A, 


(^S(^)) 


-1 


"11 So, 


(5z)8 


(5.27) 
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^ , ||■Z^fe,£^,r^||2 
an 


'i*^+4kP) \\Rd,nrs^\\E.,d, 

7^ 


By (5.17), 


^[2] 

k,x 


tr 


{RZZk,d,nZl,^^R^£ - GS(z)i?..,.„ (7?S(z)) 


-1 


^ + 4fe*d,n*^d,n(^)4fc.d, 


2 max, 


< 


A 


(fc) 


i,d,n 


+ 4\i 


5(z)2 


tr 


(^7 




Furthermore, using (5.17) in (5.28), the invariance of the trace under cyclic permutation 
and Lemma C.2 in (5.29) for the first term in the curly brackets and the spectral norm 
bounds (5.22), (5.14) and (5.24) in (5.30) yields the bound 


f[3] 


tr 


Rd 


'd,n (G^l{z)E,An {D^liz)) ' - GdAz)E.,d,nDll{z) 


^ + YkdnGfl{z)Y,,d, 


2 max,- 


< 


Ak) 


+ 4|^ 


C>(2)2 


>^7 


tr 


Rd,n (g'S(z) - Gd,„(z)) E,,d,u {Df^liz))' 


2 max,- 


1 

+ d 




tr 


Rd,nGd,n{z)Ex^d,n ( {Ed,n(^)^ Ed,n(^) 


< 


t{zy 



E.,d,n{Dfl(z)) \d, 


(5.28) 


(5.29) 


tr 


Rd,nGdA^)E.,d,nD^U^) [Dd^r.{z) - Dfliz)] ( g «(^)) 


2^ 2 maxi 

< - 

- d 


Gk) 


+ 4\z\' 


Q{zy 


(Qzy 




(5.30) 
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"II So 






II '^a:,cZ,n 


Finally, using (5.14) and (5.24) in (5.31), (5.17) and (5.19) in (5.32) and Lemma C.2 in 
(5.33), 


/; 


[4] 




n ^ tr 


- n-^Zl,^^RlGG[l{z)R\Gz,^d,n 


1 + Z^d,nGd,niz)Zk,d,n 

(l + ^e.d,n{z)) 


^ 1 ||.Rd,n|l5j ||.£'£c,d,n|l5^ 

“ d (9z)^ 


(5.31) 


n ^ tr 


- ri-^Zl,^^RlGG[%)RlGZk,d,n 


^1 + YkM,nGd^h{z)Yk,d,n^ 

(1 + ied,n(.z)) 


< - 
d 


l\\Rd,n\\s,\\E.,d,n\\s^ ( 2(^S)'+4|^r 


(3z)2 


^(0)2 


(5.32) 


tr 


<'G,.„(z)i?:;„^ - ^zi^d,nK[:G\z{z)R:,[:z,^d, 


?l/2 


l/2A(fc), 


jl/2. 


< 


1 ||.R(i,n|l5j llTllx.d.nllg^ 

d (3z)2 


/. 


(V'S) +4h 


^(z)2 


V 


tr 


RZGfl{z)R^ - ^Zld,nRdi:GZi^)Rdi:Zk,d, 
1 ll-Rrf.nllSoo 1 


jl/2 


pl/2 / 


(5.33) 


Based on these estimates on $f^{[l]}_{k,x}$, $l = 1,2,3,4$, we are ready to prove (5.4) and (5.5). In the next display, $c > 0$ denotes a constant depending only on the support of $Z_{11}$, and may change from line to line. By means of Lemma C.4, Lemma C.5, Lemma C.6 and the spectral norm bounds (5.14) and (5.24),


E 


fin 

•f k,x 


6 ^ Pd.nllSoo II 


iL 




■E|||Zfc,rf,„||i2 (2V;i5 + 4|z|2)®| 


< ^IjWRd^nWZ ||i^..d,n|lL 




(e\\Z k,d, 


1/2 
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X E ||S'd,„|||_^ +max' 






Z=1 

l^k 



12 \ 1/2 


+ Z 


< + ll«^-ll£ + l-l”) 


E 


M 

J k,x 




n®(9z) 


c f/ 2 max, |a/ 2 „| + 4 |z| 2\6 


u- 


C>(2)2 


R 




R 


.1/2 


E 


E 


.[3] 


/; 


f[4] 

J k,x 


< 


< 


ll^<^.»llL ll^-.^.-llL {\\Sd,n\\Z + \\Rd,n\\Z + \zr), 

P.,n|lL \\Ex,,,nfs^ (ll^d,n||^i + Pd.nll^L + k| 


< 


Prf.nllsL WExAnfs^ (ll*5d.n|||l + Pd.nllli + 

(||5d,„|||‘^ + ||^d,n|||t„ + \z\ 


6 ^ Cd^ \\Rd,u\\f^ WExAnWl 

n3(S5z)42 


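In order to show finally (5.13), it remains to note that for any $\varepsilon > 0$,

\[ \sum_{d=1}^{\infty} \mathbb{P}\bigg( \Big| \frac{1}{n} \sum_{k=1}^{n} f_{k,x} \Big| > \varepsilon \bigg) \le \sum_{d=1}^{\infty} \sum_{k=1}^{n} \sum_{l=1}^{4} \mathbb{P}\Big( \big| f^{[l]}_{k,x} \big| > \frac{\varepsilon}{4} \Big) \le \sum_{d=1}^{\infty} \sum_{k=1}^{n} \sum_{l=1}^{4} \frac{4^6}{\varepsilon^6} \, \mathbb{E} \big| f^{[l]}_{k,x} \big|^6 < \infty \]

by an application of the union bound, Markov's inequality, and (5.2). (5.13) is then a consequence of the Borel-Cantelli lemma.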


5.3. Step III: Existence and uniqueness of $e^{\circ}_{d,n}$

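We show that for any $d$, $n$ and $R_{d,n}$, there exists a unique $e(z) \in \mathbb{C}^+$ which solves the fixed point equation

\[ e(z) = \frac{1}{d} \operatorname{tr}\bigg\{ R_{d,n} \bigg( \frac{1}{1 + \frac{d}{n} e(z)} R_{d,n} - S_{d,n} - z I_{d \times d} \bigg)^{-1} \bigg\}, \qquad z \in \mathbb{C}^+. \tag{5.34} \]

To this end, define for any fixed $d, n$ the subsequences $(d_l)_{l \in \mathbb{N}}$ and $(n_l)_{l \in \mathbb{N}}$, where $d_l = ld$ and $n_l = ln$, $l \in \mathbb{N}$, and correspondingly the $l$-block diagonal matrices

\[ R_{(d,n)_l} = \operatorname{diag}(R_{d,n}, \dots, R_{d,n}) \quad \text{and} \quad S_{(d,n)_l} = \operatorname{diag}(S_{d,n}, \dots, S_{d,n}) \]

of size $dl \times dl$. Note that the right-hand side of (5.34) remains unchanged when replacing $d$, $n$, $R_{d,n}$ and $I_{d \times d}$ by $d_l$, $n_l$, $R_{(d,n)_l}$ and $I_{d_l \times d_l}$. By (5.5) of the previous section,

\[ e_{(d,n)_l}(z) - \frac{1}{dl} \operatorname{tr}\bigg\{ R_{(d,n)_l} \bigg( \frac{1}{1 + \frac{d}{n} e_{(d,n)_l}(z)} R_{(d,n)_l} - S_{(d,n)_l} - z I_{d_l \times d_l} \bigg)^{-1} \bigg\} \to 0 \]

a.s. as $l \to \infty$ with

\[ e_{(d,n)_l}(z) = \frac{1}{dl} \operatorname{tr}\Big\{ R_{(d,n)_l} \big( \hat\Sigma_{(d,n)_l} - z I_{d_l \times d_l} \big)^{-1} \Big\}, \]

where

\[ \hat\Sigma_{(d,n)_l} = \frac{1}{nl} \sum_{k=1}^{nl} R_{(d,n)_l}^{1/2} Z_{k,d_l,n_l} Z_{k,d_l,n_l}^* R_{(d,n)_l}^{1/2} - S_{(d,n)_l}, \]

$Z = (Z_{ik})_{i,k \in \mathbb{N}}$ is a double array of iid Rademacher variables, and $Z_{k,d,n}$ is the $k$-th column of the submatrix $Z_{d,n} = (Z_{ik,d,n})_{i \le d,\, k \le n}$. Consider a realization of these random variables where this convergence occurs. First note by (5.9),

\[ |e_{(d,n)_l}(z)| \le \frac{\kappa}{\Im(z)} \quad \forall\, l \in \mathbb{N}. \]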


By Bolzano-Weierstraß, there exists a convergent subsequence of $(e_{(d,n)_l})$ with limit $e(z)$, say, such that in particular

\[ \frac{1}{1 + \frac{d}{n} e_{(d,n)_l}(z)} \to \frac{1}{1 + \frac{d}{n} e(z)} \tag{5.35} \]

along this subsequence due to (5.20) for $e_{(d,n)_l}(z)$. By (5.5), $e(z)$ solves the fixed point equation (5.34). As $\Im(e_{(d,n)_l}(z)) > 0$ for any $l \in \mathbb{N}$ and $z \in \mathbb{C}^+$, it follows that its limit satisfies $\Im(e(z)) \ge 0$ and therefore $\Im(e(z)) > 0$, because $\Im(e(z)) = 0$ contradicts $e(z)$ being a solution of the fixed point equation. Consequently, any such solution $e$ of (5.34) enjoys the following two properties:

\[ e : \mathbb{C}^+ \to \mathbb{C}^+ \tag{5.36} \]

and

\[ |e(z)| \le \frac{\kappa}{\Im(z)} \quad \forall\, z \in \mathbb{C}^+. \tag{5.37} \]

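It remains to show uniqueness. Denoting

\[ D_{d,n}(z) = \frac{1}{1 + \frac{d}{n} e(z)} R_{d,n} - S_{d,n} - z I_{d \times d}, \tag{5.38} \]

we obtain the representation

\[ e(z) = \frac{1}{d} \operatorname{tr}\bigg\{ D_{d,n}^{-1}(z) R_{d,n} \big( D_{d,n}^*(z) \big)^{-1} \bigg( \frac{1}{1 + \frac{d}{n} e(z)^*} R_{d,n} - S_{d,n} - z^* I_{d \times d} \bigg) \bigg\}. \]

Note that $(A^*)^{-1} = (A^{-1})^*$. Now, the expression

\[ \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) R_{d,n} \big( D_{d,n}^*(z) \big)^{-1} S_{d,n} \Big\} \ge 0 \tag{5.39} \]

is in particular real because the trace of the product of two positive semidefinite Hermitian matrices is non-negative. Hence

\[ \Im(e(z)) = \alpha(e(z)) \Im(e(z)) + \beta(e(z)) \Im(z) \]

with

\begin{align*}
\alpha(e(z)) &= \frac{1}{n} \Big| 1 + \frac{d}{n} e(z) \Big|^{-2} \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) R_{d,n} \big( D_{d,n}^*(z) \big)^{-1} R_{d,n} \Big\}, \\
\beta(e(z)) &= \frac{1}{d} \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) R_{d,n} \big( D_{d,n}^*(z) \big)^{-1} \Big\}.
\end{align*}

Note that both $\alpha$ and $\beta$ are non-negative, and $\alpha(e(z)) > 0$ implies $\beta(e(z)) > 0$ since the trace of a positive semidefinite Hermitian matrix equals zero only for the null matrix. If $\tilde e(z)$ is another solution of (5.34), we obtain the analogous identity

\[ \Im(\tilde e(z)) = \alpha(\tilde e(z)) \Im(\tilde e(z)) + \beta(\tilde e(z)) \Im(z). \]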

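We denote by $\tilde D_{d,n}(z)$ the matrix $D_{d,n}(z)$ as defined in (5.38) with $\tilde e(z)$ in place of $e(z)$, and define $\alpha(\tilde e(z))$ and $\beta(\tilde e(z))$ correspondingly. Then

\begin{align*}
e(z) - \tilde e(z) &= \frac{1}{d} \operatorname{tr}\Big\{ \big( D_{d,n}^{-1}(z) - \tilde D_{d,n}^{-1}(z) \big) R_{d,n} \Big\} \\
&= \frac{1}{d} \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) \big( \tilde D_{d,n}(z) - D_{d,n}(z) \big) \tilde D_{d,n}^{-1}(z) R_{d,n} \Big\} \\
&= \frac{1}{d} \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) R_{d,n} \tilde D_{d,n}^{-1}(z) R_{d,n} \Big\} \, \frac{\big( 1 + \frac{d}{n} e(z) \big) - \big( 1 + \frac{d}{n} \tilde e(z) \big)}{\big( 1 + \frac{d}{n} e(z) \big) \big( 1 + \frac{d}{n} \tilde e(z) \big)} \\
&= \big( e(z) - \tilde e(z) \big) \, \frac{\frac{1}{n} \operatorname{tr}\big\{ D_{d,n}^{-1}(z) R_{d,n} \tilde D_{d,n}^{-1}(z) R_{d,n} \big\}}{\big( 1 + \frac{d}{n} e(z) \big) \big( 1 + \frac{d}{n} \tilde e(z) \big)} \\
&=: \big( e(z) - \tilde e(z) \big) \gamma. \tag{5.40}
\end{align*}

If $\gamma = 0$, uniqueness of $e(z)$ follows immediately. In case $\gamma \neq 0$, we deduce the inequality

\begin{align*}
|\gamma| &\le \bigg( \frac{\frac{1}{n} \operatorname{tr}\big\{ D_{d,n}^{-1}(z) R_{d,n} ( D_{d,n}^*(z) )^{-1} R_{d,n} \big\}}{\big| 1 + \frac{d}{n} e(z) \big|^2} \bigg)^{1/2} \bigg( \frac{\frac{1}{n} \operatorname{tr}\big\{ \tilde D_{d,n}^{-1}(z) R_{d,n} ( \tilde D_{d,n}^*(z) )^{-1} R_{d,n} \big\}}{\big| 1 + \frac{d}{n} \tilde e(z) \big|^2} \bigg)^{1/2} \tag{5.41} \\
&= \sqrt{\alpha(e(z))} \cdot \sqrt{\alpha(\tilde e(z))} \\
&= \bigg( \frac{\Im(e(z)) \alpha(e(z))}{\Im(e(z)) \alpha(e(z)) + \Im(z) \beta(e(z))} \bigg)^{1/2} \bigg( \frac{\Im(\tilde e(z)) \alpha(\tilde e(z))}{\Im(\tilde e(z)) \alpha(\tilde e(z)) + \Im(z) \beta(\tilde e(z))} \bigg)^{1/2}.
\end{align*}

But $\beta(e(z)), \beta(\tilde e(z)) > 0$ for $\alpha(e(z)), \alpha(\tilde e(z)) > 0$, which implies $|\gamma| < 1$ and therefore $e(z) = \tilde e(z)$.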
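To make the fixed point equation (5.34) concrete, here is a minimal numerical sketch for the special case of diagonal $R_{d,n}$ and $S_{d,n}$, where the trace reduces to a sum over diagonal entries. Plain fixed-point iteration is used only as an illustration; the proof above does not rely on it (existence comes from a subsequence argument), and convergence of the iteration is an assumption that holds for the toy parameters below, where Im(z) is comparatively large. All names and parameter values are hypothetical.

```python
def fixed_point_map(e, z, r, s, ratio):
    """Right-hand side of (5.34) for diagonal R = diag(r), S = diag(s).

    ratio stands for d/n.
    """
    d = len(r)
    c = 1.0 / (1.0 + ratio * e)
    return sum(ri / (c * ri - si - z) for ri, si in zip(r, s)) / d


def solve_fixed_point(z, r, s, ratio, tol=1e-13, max_iter=10_000):
    """Iterate e -> map(e), starting in the upper half-plane."""
    e = complex(0.0, 0.1)
    for _ in range(max_iter):
        e_new = fixed_point_map(e, z, r, s, ratio)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    raise RuntimeError("fixed-point iteration did not converge")


# Toy case: R = I, S = 0, d/n = 1/2.
r = [1.0] * 5
s = [0.0] * 5
ratio = 0.5
z = complex(1.0, 2.0)

e = solve_fixed_point(z, r, s, ratio)
residual = abs(e - fixed_point_map(e, z, r, s, ratio))

# For R = I and S = 0, (5.34) rearranges to the quadratic
# ratio*z*e^2 + (z + ratio - 1)*e + 1 = 0.
mp_quadratic = ratio * z * e**2 + (z + ratio - 1.0) * e + 1.0

print(e, residual, abs(mp_quadratic))
```

In this toy case the computed solution lies in the upper half-plane and satisfies the bound (5.37) with $\kappa = 1$, in line with (5.36) and (5.37).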


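5.4. Step IV: Identification of $e^{\circ}_{d,n}$ and $m^{\circ}_{d,n}$ as Stieltjes transforms

As concerns $e^{\circ}_{d,n}$, we know already that $e^{\circ}_{d,n} : \mathbb{C}^+ \to \mathbb{C}^+$. Its analyticity follows by the analyticity of the pointwise approximating sequence $e_{(d,n)_l}$ and the local boundedness of $(e_{(d,n)_l})$ on $\mathbb{C}^+$. Note that the pointwise convergence occurs simultaneously on a countable set with an accumulation point in $\mathbb{C}^+$ with probability 1. Using on the right hand side of (5.34) the fact that $e^{\circ}_{d,n}(z) \to 0$ as $\Im(z) \to \infty$, which follows from (5.37), we also have

\[ z \cdot e^{\circ}_{d,n}(z) \to -\frac{1}{d} \operatorname{tr}(R_{d,n}) \quad \text{as } \Im(z), \Re(z) \to \infty. \]

Hence, Lemma 2.2 in Shohat and Tamarkin (1943) implies that $e^{\circ}_{d,n}$ is the Stieltjes transform of a measure on the real line with total mass $d^{-1} \operatorname{tr}(R_{d,n})$. Define

\[ D^{\circ}_{d,n}(z) = \frac{1}{1 + \frac{d}{n} e^{\circ}_{d,n}(z)} R_{d,n} - S_{d,n} - z I_{d \times d}. \tag{5.42} \]

Finally, observe that for any $z \in \mathbb{C}^+$,

\begin{align*}
\Im\big( m^{\circ}_{d,n}(z) \big) &= \frac{1}{d} \Im \operatorname{tr}\Big\{ D^{\circ}_{d,n}(z)^{-1} \big( (D^{\circ}_{d,n}(z))^* \big)^{-1} (D^{\circ}_{d,n}(z))^* \Big\} \\
&= \frac{1}{d} \Im\bigg( \frac{1}{1 + \frac{d}{n} (e^{\circ}_{d,n}(z))^*} \bigg) \operatorname{tr}\Big( D^{\circ}_{d,n}(z)^{-1} \big( (D^{\circ}_{d,n}(z))^* \big)^{-1} R_{d,n} \Big) \\
&\quad + \frac{1}{d} \Im(z) \operatorname{tr}\Big( D^{\circ}_{d,n}(z)^{-1} \big( (D^{\circ}_{d,n}(z))^* \big)^{-1} \Big) \\
&> 0 \tag{5.43}
\end{align*}

since both $\Im(z)$ and $\Im(e^{\circ}_{d,n}(z))$ are strictly positive. Furthermore, since $e^{\circ}_{d,n}(z) \to 0$ as $\Im(z) \to \infty$ by (5.37), we conclude

\[ z \cdot m^{\circ}_{d,n}(z) \to -1 \quad \text{as } \Im(z), \Re(z) \to \infty. \]

As above, $m^{\circ}_{d,n}$ is the Stieltjes transform of a measure on the real line with total mass 1.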
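The way the total mass is read off from the behaviour of $z \cdot s(z)$ far from the real axis can be illustrated for a discrete measure, where the limit is elementary. This is only a toy check with hypothetical atoms and weights; it does not reproduce the cited lemma of Shohat and Tamarkin.

```python
def stieltjes_discrete(weights, atoms, z):
    """Stieltjes transform sum_i w_i / (a_i - z) of a discrete measure."""
    return sum(w / (a - z) for w, a in zip(weights, atoms))


weights = [0.2, 0.5, 0.3]     # hypothetical non-negative weights
atoms = [-1.0, 0.5, 2.0]      # hypothetical support points
mass = sum(weights)

z = complex(0.0, 1e6)         # far up the imaginary axis
recovered_mass = -(z * stieltjes_discrete(weights, atoms, z))

print(mass, recovered_mass)
```

Since $z \, w_i / (a_i - z) = -w_i / (1 - a_i / z)$, the product tends to minus the total mass as $|z| \to \infty$ along the imaginary axis, with an error of order $\max_i |a_i| / |z|$.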


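5.5. Step V: Approximation of $e_{d,n}$ by $e^{\circ}_{d,n}$

Let $e^{\circ}_{d,n}$ denote the solution of (5.34). We will show that for any $z \in \mathbb{C}^+$,

\[ e_{d,n}(z) - e^{\circ}_{d,n}(z) \to 0 \quad \text{a.s. as } d \to \infty. \tag{5.44} \]

Define

\[ \alpha^{\circ}(z) = \alpha\big( e^{\circ}_{d,n}(z) \big) \quad \text{and} \quad \beta^{\circ}(z) = \beta\big( e^{\circ}_{d,n}(z) \big) \]

such that

\[ \Im\big( e^{\circ}_{d,n}(z) \big) = \alpha^{\circ}(z) \Im\big( e^{\circ}_{d,n}(z) \big) + \beta^{\circ}(z) \Im(z). \tag{5.45} \]

Noting that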


^^<\\Rd,n\\s^^ 
p [zj n 


r) ’ 


we deduce 


a°(z) 


= -||b’d.n||s„„3 


1 + P^^XniZ') 


-2 


(5.46) 
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< ll^d.rillSo, 


1 


2 max, 


< ||^d,n||s„ limsup ■ 

I—>00 


(^■=‘(d,n)i^ +4:\i 


^( 2)2 


(5.47) 


where the last inequality follows by convergence (5.35) and bound (5.19) (in the latter 
the eigenvalues corresponding to have to be inserted). As a consequence, 


a°{z) = 


(e3.„(^)) a°( 2 ) 


< 


jz)/3°(z) + 3 (e° „(z)) a°{z)^ 

2||fld,n||s^||Srf,„|||^+4|z|2 

( 9 z )3 + 2 ||i?d,«||Sool|S<i.n|l|„„ + 4 | 2 :| 


(5.48) 


(5.49) 


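where the first identity (5.48) follows by rearrangement of (5.45), and after expanding the fraction by $(\beta^{\circ}(z))^{-1}$ we used the elementary inequality

\[ \frac{x}{y + x} \le \frac{z}{y + z} \quad \text{for } x, y, z > 0 \text{ and } x \le z \]

and (5.47) in (5.49). By (5.12),

\[ e_{d,n}(z) = \frac{1}{d} \operatorname{tr}\big( R_{d,n} D_{d,n}^{-1}(z) \big) - \frac{1}{n} \sum_{k=1}^{n} f_{k,e}. \]

Then as previously in (5.39) and the subsequent display, we obtain the representation

\begin{align*}
\Im(e_{d,n}(z)) &= \frac{1}{d} \Im\bigg( \frac{1}{1 + \frac{d}{n} e_{d,n}(z)^*} \bigg) \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) R_{d,n} \big( D_{d,n}^*(z) \big)^{-1} R_{d,n} \Big\} \\
&\quad + \frac{1}{d} \Im(z) \operatorname{tr}\Big\{ D_{d,n}^{-1}(z) R_{d,n} \big( D_{d,n}^*(z) \big)^{-1} \Big\} - \frac{1}{n} \sum_{k=1}^{n} \Im(f_{k,e}) \\
&= \alpha(e_{d,n}(z)) \Im(e_{d,n}(z)) + \Im(z) \beta(e_{d,n}(z)) - \frac{1}{n} \sum_{k=1}^{n} \Im(f_{k,e}), \tag{5.50}
\end{align*}

and as in (5.40),

\[ e_{d,n}(z) - e^{\circ}_{d,n}(z) = \gamma \big( e_{d,n}(z) - e^{\circ}_{d,n}(z) \big) - \frac{1}{n} \sum_{k=1}^{n} f_{k,e} \tag{5.51} \]

with

\[ |\gamma| \le \sqrt{\alpha^{\circ}(z) \, \alpha(e_{d,n}(z))}. \tag{5.52} \]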
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Consider a realization for which the convergence 

1 




k=l 


occurs. Then in particular, 


71 


fc = l 


< ^{z) 


4d(||i?d.n||s„„ VI) 


2||Sd,„||L+4|zr 

9(z)2 


-2 


for sufficiently large d. Recall that by definition of a{ed,n{z)) and P{ed,n{z)), 

-2 


a(ed.„(z)) ^ I, II d 

PC c ‘S’oo 


1 + -ed,niz) 
n 


(5.53) 


(5.54) 


Hence, if 


I3ied,n{z)) < 


2 ||- 




+ 4|z( 


4(i (Iii?(i,n||s„„ V 1) 
then inserting (5.19) into (5.54) yields 


^(z )2 


a{ed,n{z)) < ||i?d,n||s„ 


d / 2||S,,„|||^+4|z|' 
n i 9(z)2 


f^{ed,n{^)) ^ 


in which case (5.52) implies $|\gamma| \le 1/2$ since $\alpha^{\circ}(z) \le 1$ by (5.48) and the non-negativity of $\alpha^{\circ}(z)$, $\beta^{\circ}(z)$ and $\Im(e^{\circ}_{d,n}(z))$. Otherwise, if


l3ie°dJz)) > 


2||S,,„|||^+4|z( 

5(z)2 


4d(||i?d,„||s^ VI) 

(5.52), (5.50), (5.53), and (5.49) imply 

^ {e-d,n{z)) a{ed^n{z)) 


I 7 I < pa°{z) 


S ied,n{^)) a(ed,niz)) + Q{z)f3{ed,niz)) - ^ Y.k=l ^ Uk,e) 

112 41^12 \ 


1/2 


< 


2||fl,,„|U^||S,.„|||^+4|z( 
(9z)3+2||i?d.n||s„.||S<i,„|||^+4|z 


As $d \to \infty$ the limit superior of the last expression is bounded by some positive constant $\bar\gamma(z) < 1$ almost surely. Finally, solving the equation (5.51) for $e_{d,n} - e^{\circ}_{d,n}$ using the upper bounds on $|\gamma|$, we obtain

\[ \big| e_{d,n}(z) - e^{\circ}_{d,n}(z) \big| \le \frac{\big| \frac{1}{n} \sum_{k=1}^{n} f_{k,e} \big|}{1 - \big( \frac{1}{2} \vee \bar\gamma(z) \big)} \to 0 \quad \text{a.s.} \tag{5.55} \]

as $d \to \infty$, by (5.13). This proves (5.44).

5.6. Step VI: Approximation of $m_{d,n}$ by $m^{\circ}_{d,n}$

Without loss of generality we may assume that either 

di ^ di ^ 

— >1 or — <1 
n n 

holds on the whole sequence. We start with the first case. Recall the definition (3.1) of 
and (5.42) of and note that 


while by (5.11), 


with 


Then, 


w3.„(z) = ^tr (^(£»5 „(z)) , 

1 1 ^ 

'^d,n — ^ ^ ~ ^ ^ fk,n 


n 


/fc,m 0 a.s. as d - 




1 1 ^ 
- m^niz) = - tr 




1 1 ^ 

2 (Dl^iz))-^} --Y h^rn 

^ k^l 

1 ^d,n{z) — e^^^{z) r 1 , . / o , x\-l\ 


^(l + sed,„(z))(l + ^e3_„(z)) 

So, almost surely by (5.24), (5.19) and (5.55), 
limsup \md,n - m°^,n\ 


E fk,m • 




d—^oo 


^ I on- ^IID II (2|l“d,n||s^ + 4|z| ) 

< hmsup \ed,n - „ hmsup - i?d,n Soo-- 

d->oo d->oo n is{z)^ 

= 0 . 



n 


1 . 


Now, consider the case 
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Due to 


d dsup^ 

z:\^d,n\ < 


n n 

for any z S C+ and by reasons of continuity, we conclude 




^ 0 


for any z S C"*", where 
remains to show that 


is the spectral measure of the matrix T^^n- Therefore, it 


md,n{z) - m^T^^„{z) 


0 . 


By Lemma C.12 and Lemma C.13 this convergence holds true if (iL(/Xd,n,—>■ 0. 
Theorem C.IO for a = 1 and inequality (1.2) of Li and Mathias (1999) yield 




d 

<5i:iA.(: 

2=1 


■d,ra) — Ai(Td_„)| < 


^ W W* p 

~^d,n^d,n^d,n^d,n ~ ^d,n 


Finally, for arbitrary e > 0 and d sufficiently large we apply Corollary 5.50 of Vershynin 
(2012) with t = 1 so that 


iRd,nXd,nXXr^Rd',:-Rd,n 


1/2 


< £ 


with probability at least 1 — 2exp(—c?). Again, by the Borel-Cantelli lemma, 

1/2 

pl/2 Y' Y”* f? 

n 


db {ldd,n,ld 


Td . 


< 


0 


almost surely as d —oo. 


5.7. Step VII: Weak approximation of the spectral measures 


First we show that the measure $\mu^{\circ}_{d,n}$ has compact support. Thereto, define similarly to the definition of $e_{(d,n)_l}$, $l \in \mathbb{N}$, in Step III,


'm(d,n)i{z) 




By (5.4), 


^(d,n 



1 + n^id,n)iiz) 


R{d^n)i R(d^n)i zldixdi 


0 as I ^ OO 
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almost surely. Note that 

if/ 1 

^tr 




^{d,n)i ^{d,n)i zldj^xdi 


= \tr- 


^ 1 n^(.d,n)i[z) 

and therefore by reasons of continuity 


^d,n S(i,n ^^dxd 




1 


-1 


1 + n^°d.ni^) 


Rd,n Sd,n zldxc 


0 as Z —)■ 00 


almost surely because of (5.44). This implies that $\mu^{\circ}_{d,n}$ is the weak limit of $(\mu_{(d,n)_l})$, and in particular the support of $\mu^{\circ}_{d,n}$ is bounded since

I inf {a; : Aid_„((-oo, x]) > 0} | > liminf > -||5'd,„||s^ 


and 


I sup {a; : /x2_„((-oo, a;]) < l} | < limsup ||S(d,„), ||g^ < ||5'd,„||s^ +c', 

l—¥oo °° 

where c' > 0 is a constant satisfying inequality (C.6) of Lemma C.7 applied to 


nl 

_\ ^ y y* 

^(d,n)i^>^A,mZ^k,di,ni^(d,n)i ’ 


k=l 

and is chosen uniformly over d S N. Subsequently, we assume that d (in dependence on 
the specific realization) is sufficiently large such that 

|inf {x : fdd,nii-oo,x]) > 0}| > -||S'd,„||s„„ - c" 


and 

|sup{a; :/rd.n((-oo,a;]) < 1}| < ||5'd,„||so„ + c" 

with an appropriate constant $c'' > 0$ from (C.6). Now, define $c = c' \vee c''$. For fixed
0 < u < 1, define the closed interval K = [uq, with 

ui = (||5'd,„||g^ + c) + (||5'd,„||s_^ + c) 

for Z = 1,..., + 1. By Step VI, we have 

\md,„{ui + iv) - + iv)\ < v 
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simultaneously at all points ui, I = 0,... ,[v + 1, almost surely for all d sufficiently 

large. Furthermore, for any inner point u of K, pick I such that u G Then, 


md,n{u + iv) - + iv) \ 

< \md,niu + iv) - md,n{ui + i?;)| + | m°d,^{u + iv) 
+ I md,n{ui + iv) - m°d^„{ui + iv) \ 


< 


1 


X — u — iv 


1 

X — ui — iv 


d^J,d,n{x) 


+ 


/ 


1 

X — u — iv 


1 

X — ui — iv 


^d°d,u{^)+V 


- j ^ + d°d,n)(.x) + V 

< ?;(4|mo| + 1). 


’mXui'u-i + iv) 


Next, we derive an upper bound on the integral 



md,n{u + iv) 


m°dn{v + iv) dw 


which tends to zero for n —> 0. For this aim, we decompose the integral into 



md,n{u + iv) — nfd j^iu + iv) \ du 

r 

\ vnd,n{u + iv) — m°d „(u + iv) \ du 
+ / I md,n{u + iv) 


'(-oo,uo) 


m^niu + iv) I du 


We can use the same arguments for both integrals and therefore only consider the first 
one. By Fubini’s theorem and the bounds on the support of ^d,n and 


'(-00,Mo) 
< 


< 


md,n{u + iv) — md^n{u + iv) \ du 


(-oo,«o) 


X — u — IV y — u — IV 


dw dfid,n{d:) d/Xd,n(2/) 


< 


[ [ [ 7— 1/4^ d7id,„(a;) d^x^ (y) 

J J Ji-oo,uo) [u-v^/'^uo)^ 

J J \x-y\ d/Xd,„(a;) dMd,„(y) 


(1 - v^/*)\uo 
- (/ la^ldMd.n(a^) + I \y\dii°d,niy) 
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,1/4 


< 2 - 


1 — lil/4 

Now, by Lemma C.ll we conclude 

dL{lJ-d,n, S°d,n) < ^ / I + iv) - rn°^^^(u + iv) I dw 

fv 1 2 

<2J- + -KI(4KI + I)i^ + -^- 

V TT TT TT 1 — 

where the inequalities hold almost surely for all $d$ sufficiently large. Hence,

\[ d_L(\mu_{d,n}, \mu^{\circ}_{d,n}) \to 0 \]

almost surely as $d \to \infty$. Lemma C.13 yields finally $\mu_{d,n} - \mu^{\circ}_{d,n} \Rightarrow 0$ a.s.

□


5.8. Proof of Corollary 3.2 


As afore-mentioned to the corollary, 

,,o _ ,,MP , r 

H'd.n ~ dii EO * O g.2 i-PQ ■ 

" ’ 0-2 Po 

Therefore, by the representation (2.1) of the Marcenko-Pastur distribution we deduce 

,,o _. ^ X 

l^d,n ^ 0_ i-PQ ^2 ; 

y’PO PO 


such that 


h-d,r, 


d. 


MP . c 

★ 0 1-p 


Furthermore, if the left edge of the limiting distribution 


iVlP , X 

A, 

y’po PO 


is smaller than zero, then almost surely 


limSUpAmin(-d.n) < 0. 

d—>oo 

For $y < 1$ the left edge of the limiting distribution is smaller than zero if and only if

\[ p_0 < 1 - (1 - \sqrt{y})^2. \]

□
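The threshold in the last display can be made tangible with a few lines of arithmetic. The sketch below assumes, as the abstract and the limits in the proof of Theorem 3.3 suggest, that the limiting law is a Marčenko-Pastur law with ratio $y$ and scale $\sigma^2/p_0$, shifted to the left by $\sigma^2 (1-p_0)/p_0$, so that its left edge is $\frac{\sigma^2}{p_0}(1-\sqrt{y})^2 - \frac{1-p_0}{p_0}\sigma^2$. The function names and the numerical values are illustrative only.

```python
import math


def left_edge(y, p0, sigma2=1.0):
    """Left edge of the shifted Marcenko-Pastur law (assumed form, y < 1)."""
    return sigma2 / p0 * (1.0 - math.sqrt(y)) ** 2 - sigma2 * (1.0 - p0) / p0


def critical_probability(y):
    """Observation probability at which the left edge equals zero."""
    return 1.0 - (1.0 - math.sqrt(y)) ** 2


y = 0.25
p_crit = critical_probability(y)

for p0 in (0.7, p_crit, 0.8):
    print(f"p0 = {p0:.3f}, left edge = {left_edge(y, p0):+.4f}")
```

For $y = 1/4$ the critical observation probability is $3/4$: below it the left edge is negative, above it positive, which matches the dichotomy stated in the abstract.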

6. Proof of Theorem 3.3 

We will show Theorem 3.3 by means of the next proposition. The proof of the proposition 
is postponed to Appendix B. 
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Proposition 6.1. Let {X{i,k))i^ken be a double array of iid centered random variables 
with unit variance and finite fourth moment, and denote by Xd,n £ its d x n 

submatrix in the upper left corner. Moreover, let {Ad,n)d,n, Ad,n £ be a sequence of 

symmetric random matrices and {Bd,n)d,n, Bd,n £ be another sequence of random 

matrices such that {Ad n,Bdn) O'^d Xd „ are independent. Let d, n —)■ oo and d/n y > 
0. If 

limsupmax \Aij d n\ dn (6-1) 

d^oo hi ’ ’ iA 


for some absolute constant a > 0, then 


lim sup 

d—^oo 


® ^d,n) i^d,n -^tZ,n) ) 

n ' 


<a{l + y/yf a.s. 

Soo 


( 6 . 2 ) 


Proof of Theorem 3.3. By Weyl’s inequality, we obtain 


-^max ^ ^ ^ ((-^cZ,n ^ ^d,n) {^d,n ^ ^d,n) ) ^ 

H“ '^min ® ® ^d,n'} {^d,n ^ ^d^n) 

^ -^max ® ((-^d,n ^ ^cZ,n) (^fi,n ® ^d,n) 

^ -^max ^ ^^d,n ® {{^d,n ^d,n) {^d,n ® ^d,n) 

“1“ '^max (^^^d,n ® ((-^rf,n ^ ^d,n) {^d,n ^ ■ 

•^min ^ ((-^rf,n ^ ^d^n) {^d.,n ^ ^d,n') 

H“ '^min ^ (^^^d,n ^ (^{^d,n ^ ^d,n) {^d,n ^ 

-^min ^ ((^d,n ^ ^d,n'} i^d,n ^ ^d,n) 

^ -^min ^ ^d,n^ i^d,n ^d,n'} 

(^{^d,n ^d,n^ {^d,n ^d,n') ) j ■ 


Because of 


-^max ® (^{^d,n ^d^n) i^d,n ^d,n) 
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— '^niax ( {'^d,n'^d n) ® i{^d,n ® ^d,n) {^d,n ® ^d,n')'} ^ ^dxd 

V”- Po 


+ diag 


(y^d,n '^d,n’d^d,n) ® ii^d,n ® ^d,n} {^d,n ® ^d,n) 


1-PO 2 


0^1 


Po 


dxd 


and 


diag 


{}^d,n 'dJd.n'dJfi yj) O O €d n) (^d,n ® ^d,n) ) 

n . / V 

—>■ 0 a.s. as d —>■ oo 


1 -Po 2 


Po 


cr Idxc 


by the Marcinkiewicz-Zygmund strong law of large numbers (cf. Lemma B.25 in Bai and 
Silverstein (2010)), we obtain again by Weyl’s inequality and Theorem 1 of Bai and Yin 
(1993) 

Amax (-Wd,n O {{Xd,n O £d,n) {Xd,n O Ed.n)*)) ^ — (1 + y/vf - - - 

\n ^ ' J Po Po 

With same argument, 

'^niin ^ ((Af(i,n ^ ^d,n') {Xd^n ^ ^d,n) 

In order to finish the proof, it suffices to show that 

^ ((Xd^n ^ £d,n) (Xd^n ^ ^rf,n) 

But this is an easy consequence of Proposition 6.1 since by (A.12), 


a.s <7 , „2 1 Po 2 

— (1 - v^)-• 

Po Po 


0 . 


lim sup max 

n—¥oo 




0 . 


□ 


Appendix A: Proof of Proposition 4.1 

A.l. Step I: Modifying e^i^n 

By tightness of (/r“‘^’") we have for any d > 0 a constant po > 0 such that for sufficiently 
large d S N 


#{Pi,d,n < Po} < dd. 
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We replace the niatrix ed,n by ed,n, where eik,d,n = £ik,d,n if Pi > Po and otherwise £ik,d,n 
is a Bernoulli random variable with P(eife = 1) = po such that the entries of are 
independent and jointly independent of Yi d^n, •■•;dn,d,n- be the matrix as Td^n but 
relying on the missingness matrix ed,n in place of £d,n- Since by Theorem C.8 

dx i rank('fd_„ - fd^^i) < <5, 

we may assume subsequently Pi^d,n > Po- 

A.2. Step II: Removing }^Wd,n o ({Md,n o ed,n){Md,n o 

Let 

Td,n = Td,„ - -^d,n O {{Md,n O £d,n){Md,n => £d,n)*^ ■ 

First note that 

P ( min#A/ij = 0 ) < d^maxP(#A/ij = 0) < d^(l —Pq)". 

V 

Hence, by the Borel-Cantelli lemma we have almost surely for all but finitely many indices 
d ^ 

^ ® £d,n)i^d,n ^ £d,n) ^ — '^d^n'^d^n' 

Now, by Theorem C.8 we have 

lim sup dx (1 = 0 a.s. 

d—>-oo ^ ^ 

Therefore it is sufficient to prove dL(p^‘^’", p^"^'") —>• 0. In the next subsection, we refer 
to Td^n as Td^n- 

A.3. Step III: Truncation of Td,n

By the tightness of the sequence (p^”^'") we have for any <5 > 0 a constant tq > 0 such 
that for sufficiently large d S N 

^{Ykk,d,n ^ 'To} ^ dd. 

Therefore, let Td^n — diag(ll'[Tji^c;^ 7 i ^ "rofTii.tZ.n; ll{-Lrf(i,tZ,n ^ "roj-Tdrf.d.n) and Td^n be 
the sample covariance matrix with missing observations built from the random variables 

Yi^d,n — Yd^n^i,d,ni ^ — 1 , . . . , 71 , 

while ed,n remains the same. Since again by Theorem C.8 

dx (^p^‘^'",p^‘^-"^ < 2 rank (td^n - Td,n'^ < S, 

it is sufficient to assume subsequently that the spectral measures of the sequence {Td^n) 
have uniformly bounded support. 
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A.4. Step IV: Truncation of X(i,n 

For 0 < 5 < I we truncate the variables Vfc.d.n at the threshold level 
a > Hence, let 




(A.1) 


and Td^n, yd,n, and Md^n be the matrices constructed by replacing Xd^n with Xd,n = 
iXik,d,n) in Td,n, Yd,n, and Md,n- We have 




< - rank(rd_„ - Td,n) 


= — rank 
a 


= — rank 
a 


< — rank 
a 


^ (((^^,n ^d,n) ^ ((^^,n ^d,n) ^ ^d^n) 


- {{Md,n - Md,n) O £d,n){Yd,n O £d,n)*) 


H—: rank 
a 


^^d,n ^ (((^^,n ^d,n') ^ ^d,n'){_i^d,n ^d,n) ^ ^d,n) 


{^d,n ^ ^d,n){{^d,n ^d,n) ^ ^d,n) 


^^d,n ^ ({^d,n ^ ^d^n'Ji^d^n ^ ^d,n^ {^d,n ^ £d,n){^d,n ^d,n'} 

n \ 

(-^rf,n ^ ^d,n){^d,n ^ ^d,n} i^d,n ^d,n}{^d,n ^ ^d,n'} 
{^d,n ^ ^d,n){^d,n '^d,n) H“ {^d,n ^d,n^{^d,n ^ ^cZ,n) 

~^^^d,n ^ ^d,n) ^ ^d,n'}i^d,n ^d,n) 

“t“ {^d,n ^ ^d.n^iiy^d^n ^d,n') ^ ^d^n) 

- {Md,n O ed,n){{yd,n “ >d,n) O Ed^nT 

- {{Md^n - Md,n) O ed,n){yd,n ^ €d,nT 

- {{yd,n - yd,n) O Sd,n){Md,n ^ Cd^nT 

{yd^n ^ ^d,n){{^d,n ^d,n) ^ ^d,n'} 

1 

n 


< -# < i G {1,... ,(i} : ^ l{\X^k,d,n\ > > 0 


k^l 


< ^ ^ H\Xik,d,n\ > 


i.k 


(A.2) 

(A.3) 
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where inequality (A.2) follows by the simple observation that the i-th row respectively 
the i-th column of the matrices 



^d,n) ^ (^{^d,n ^d,n) ^ ^d,n^ 

and 

(^{^d,n ^d,n) ^ {^d,n ^ ^d,n') 

respectively 

(j^^d,n ^d,n) ^ ^^,n) ^ 

and 

{^d,n ^ ^d,n^ (^{^d.n ^d,n) ^ 

is the null vector if 

TL 


>nl/2d“-l/2) =0. 

k^l 

Next we prove that 

^ J2 ^\X^k,d,n\ > ^ 0 

i,k 


as $d \to \infty$. Note first that by Markov's inequality

Var (l{|Xn,d.„| > < El{\X,k,d,n\ > ni/2d“-i/2} 

< (A.4) 

Using (A.4) in (A.5), and (A.4) in Bernstein's inequality in (A.7), we conclude for sufficiently large $d$ and some constant $\beta > 0$




k.i 


> > d 



= P ^ _ El{\X,k,d,n\ > 

\ k,i 

> d^-^ - ndEl{|Xii^d,„| > 
< p( Y.il{\X,k,d,n\ > _ El{\X,k,d,n\ > 

\ k,i 


> d 


1-8 


rf2(l-a) 


(A.5) 
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\ k,i 

> (A.6) 

< exp {—/3d^~^) , (A.7) 

where inequality (A.6) holds since a > (1 + 6)/2. So, by inequality (A.3) follows 

dx 0 for d —>■ oo. 

Note that Xd,n is not centered and standardized, but by Cauchy-Schwarz inequality and 
Markov inequality, 

jPAj/ijc/yjl = '^Xi}z^d,rL 

< ^Jn\X^Kd,u\ > n^l^d^-^l^) 

< n-A2di/2-« (A.8) 

and moreover, XsLr{Xik,d,n) t 1 as d —)■ oo. In the subsequent section we redefine the 
matrix Xd^n by Xd^n and keep the initial notations. 


A.5. Step V: Replacing the normalizing matrix n ^Wd, 


Let 

^d,n ~ ^ ® ^rf,n)(Prf,n ® ^d,n) ^ 

kbtZ.n ® ^{^d,n ^ S^d,n)(Ld,n ^ ^rf,n) ^ 
kbtZ.n ® ® ^d,n'){^d,n ® ^d,n) ^ • 

By Theorem C.9, the elementary inequality 

tT{{C + Df) < 2tr{C‘^ + D^) 
for symmetric d x d matrices C and D, applied to 

C=^{Wd,n-Wd,n)o({{Yd,uOed,n){Yd,nOed,n)*)), 

D = -^{Wd,n - Wd,n) O (^{{Md,n O £d,n)iYd,n O £d,n)*) + {{Yd,n O £d,n){Md,n O ^d.n)*)), 
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as well as the inequality 

tr[(^ + ^*)^] < 4tr{AA*) 
for any matrix A with real entries, we deduce 




1 

- 


{Wd,n - Wd,n) O ({{Yd,nOed,n){Yd,nOed,nr) 


2n 


2 

- d*’' 


((-^cZ,n ^ ^d,n){^d,n ^ ^d^n) ) ((^^,n ^ ^d,n'}i^d,n ^ ^d, 

hWd,n - Wd,n) O (^{iYd,n O ed,n){Yd,„ O ed,„)*)) 


: tr 


{Wd,n - Wd,ny O ( {{Md,n O ed,n){Yd,n O Ed.n)*) 


^ (^{Yd^n ^ ^d,n){^d,n ^ ^d,n) 

' J 

=: hd^n- (^-9) 

We prove that hd,n —> 0 a.s. as d —)■ oo. Thereto, define for an arbitrary constant 


7 > -v/da + 7 

the event 

Ad,n = |v 1 < i,J < d : \{md,d,n)-^ - (W'y.d.„)-1 


< 7 


logn 


(A.IO) 


(A.ll) 


Then, for sufficiently large d the union bound and Hoeffding’s inequality yield 


¥{Ad,n) = 1 - min) 


> 1 — max 1 


> 1 — 2d^ exp ( — 


imj,d,n)~^ 

7 ^ logn^ 







By the Borel-Cantelli lemma all but finitely many events $A_{d,n}$ almost surely occur. Hence, if $\mathbb{1}_{A_{d,n}} h_{d,n} \to 0$ a.s. for $d \to \infty$ then $h_{d,n} \to 0$ a.s. Note furthermore that on the event
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{Wij^d,n) ^ {Wij^d,n) ^ 


{W.j,d,n)-^ - 


< 


< 


{W,J,d,n)-HW^J,d,n)-^ 

7 A/(log n)/n 


< 


_ lV0-Ogn)/n _ 

inin*P?,d.n (min»Pi.d.n “ 7\/(logn)/n 

27 /logn 


V n 


(A.12) 


for d sufficiently large. Now we prove that „^d,n —t 0. In order to save space the 
explicit dependence on d and n is suppressed in the displays until the end of the section. 
By inequality (A.12), we have 




where 


min, »?dn^ 

* i,i=l 


EM E ^ik^jk^ik^jk J ^ ^ ^ik^jk^i 


k^jk 


< 


87 ^ log n 
mini Pidn^ 


:j=i \ \k=i 
d n 


\k^l 


E E l^YikYjkYiiYj 


ji\ 


, i,j — l k,l—l 

n ^ 


fc,Z=l 


= /l+/2, 


/l = 


87 ^ log n 
mini Pidn^ 


d n 


E E 

i,j—l k,l—l 


00 2 1 d Tl 

/2 = i^^E E 


mini p^dn^ 


^ 1 fc,Z —1 


and 
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For the first term we obtain by (A. 8 ), (2.2), uniform boundedness of the entries of Td^n, 
and (A.l) 


h = 


87 ^ log n 
min, pfdn^ 


d n 


d n 


Y, E mkY,kYuY,i\+Y E 


, i,j—l k,l—l 


i=l k,l — l 
k^l 


d 




< 


i—\ k—1 

logn logn dlogn logn 


i,j = l k—1 


ndd^a-l ^ ^2 ^^l-2a 

^ ndi-2« 

Recall the definition (4.1) of Md,n- Using again the bound 


\Wu\ < 


1 


< 


on the event A 


for d sufficiently large, we get for the second term with the same type of arguments 


(A.13) 


/9 = 


247 ^ log n 

min,- p^drA ^^ —' n 

* i^j — lki^k‘2,k^,k4^ — l 


E E IE 2 ^3^2 ^^3 '^ik4^^ik\ ^ jk\^ik2 




min,- p}‘^dn^ ^ 


k2,k2k4 — l 


< 


logn 

dvP 


E E 


E 


|EFifci Uifca Yik3 Yiki I 


. i—1 \ ki,k2-,k^,k4 — l /ci,/c2,^3 ,/l4 = 1 

fci5^fe2#fc3#fc4 -^(ki^k^^k^^ki) 


E E 


E 


Yjk2^ik^^ik4 


i^j — 1 \ /ci ,^2 ,^3 ,^4 = 1 /ci ,^2 ,^3 ,fc4 = l / 

kiy^k2y^k^^k4 ^(ki^k2^k3^k4) 

logn ^ 

j3—4a„2 I j2a„4 i ^4—4a_2 i j2„3'\ 


^ dn^ " 

^ rfZa-llog^ _^ ^ 


n + d n 


We need a sufficiently tight bound on the variance of hd^n^Ad,^ in order to conclude by 
the Borel-Cantelli lemma that in addition hd^n^Adn 0 almost surely. Thereto, define 


Gij^d.n — ^ (^^Yij d^n 5 bj 


= 1 , ...,d. 
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Using (A.12) in (A.15) and dropping those summands of (A.14) whose indices satisfy 
{h,3i} n {* 2 , 72 } 0 , we get 


Var h\A 








I + 8 (' ^ 

\ / / 


E +8( Y. ]ia\ 

\ \ feGTVi,,, / \ feGA7„,, / / J 


^2 E ^{Glnin E ^nkY,,A +8( ^ M,,kY,,k] J1^ 

/ \ fcGTViiJi / / 


d 2 


*1,42 J l J 2 = l 


. fcGTVi, 


(A.14) 


4^L2f2f ^ y.,fey,,fe) Asf ^ m,kY,,k) )i2i| 

I \ VfcGTVioi, / \kGAfi^j^ / / J 


< 2-Sniog..); ^ Ejff y Y,,,Y,S+( Y. 


minpj^^d^n® 


il.i2 .il J2 = l 
{o,ll}n{i2 J2}5^0 






'y ^ Yi2kYj^k 

. jr, / 


+ d2 E El^Ell E 


*l.*2,il j'2 = l 
{o,li}n{i2 j2}=0 




X G 


^232 ' 


E 


j2k 


. keJ^i2 


I + E M,2kY,2k 1a 

\ feGTVijjj Y '' ' 

(A.15) 

1 + I E/ ^iikYj^kj I 

) + (^ E 

(A.16) 

2 / \ 2 s ^ 


d 2 


E E E + E )ia 


»i 42 Jl J 2 = l 
{*i,li}n{i2 j2}=0 


V feGA5, 


V fcGA^i, 
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where Ji consists of the term (A.15) and I 2 of (A.16) and (A.17). The term Ii yields 


h ^ ^ 1,1 + h,2 + I] 


1.3) 


with 


h,! = 


Il,2 = 


Il,3 = 


(log^ 

cPn^ 


0-Ognf 

(Pn^ 


0-Ognf 

(PrP 


E 

}n{i 2 , 

d 

E 


il,i2 J'l,i2 = l fcl,fe2,fc3.fc4 = l 

{*i,ii}n{i 2 .i 2}#0 


n,*2 01,12 = 1 fci,fe2,fc3.fc4 = l 

{*i,ii}n{i2,i2}#0 


E 


41,42,11,12 = 1 fci,fe2,fc3.fc4 = l 
{4i,ii}n{42,i2}#0 


E |E^nfci>Sifc I^lfc2^lfc2 ^2^43^2*3^2*4 ^' 2*4 I ) 

,* 3 ,*4 = 1 

n 

^ ^ -^42 *3 ^2 *3 -^42 *4 ^2 *4 

“ .*4 = 1 

X ^4l*l^jl*l£4l*2^il*2^i2fe3^i2*3^42*4^l2*4 | ) 

n 

E/ ® (-^41*1 ^1*1 -^41 *2 ^1 *2 -^42 *3 ^2 *3 -^*2 *4 X 72 *4 

“ .*4 = 1 

^ ^41*1^11*1^41*2^11*2^42*3^12*3^42*4^12*4 ^ll^ 


For $I_{1,1}$ we have
^ (logn )2 


'y ^ y ^ I®^1*1^1*1^1*2^1*2^2*3^2*3^2*4^2*4 I 


41.42,11,12 = 1 *1,*2,*3,*4 = 1 

{4l,ll}n{42,i2}5^0 
4l/ilV42#j2 


< 


d?"rp 

*(logn)^ 


2 

2fc4 


i — 1 fci ,/C2 ,^3 ,^4 = 1 


where we used for $i_1, j_1, i_2, j_2$ with $\{i_1, j_1\} \cap \{i_2, j_2\} \neq \emptyset$ and $i_1 \neq j_1$ or $i_2 \neq j_2$ the
bounds


|IEbij^fej^yj7/jj^yij^fe2yj7fe2^2*3^'2*3^2*4^'2*4 I ^ 

and for $i_1 = j_1 = i_2 = j_2$ the estimates


^ 2 d‘ia -2 #{fci, ^2, ^3, ^4} = 1 

nd?°‘~^ for #{fci, ^ 2 , ^ 3 , ^ 4 } = 2 

1 for #{fci, A: 2 ,A: 3 ,fc 4 } = 3 

n~‘^d?~‘^°‘ for #{fci, A; 2 ,/ca, fc 4 } = 4 


\y 2 -^2 -^2 < 

^^i*l-*^4*2-*^4*3'' 4*4 A 


{ ^3^6a-3 
^2^4a-2 

nd'^°‘~^ 


for #{A:i,fc2,A;3,A:4} = 1 
for #{A:i,fc2,fc3,^4} = 2 
for #{A:i, fc 2 , fcs, ^ 4 } = 3 
for #{A:i,fc2,fc3,^4} =4. 
These estimates are deduced by the following consideration. First, the expectation is
factorized by independence into a product of moments of the $Y_{ik}$'s. Then, applying (A.1)
and (A.8), the $l$-th moment is bounded by

, $l \in \mathbb{N}$.

Now we evaluate $I_{1,2}$. Using (A.13) in (A.18)


U ,2 = 


(log^ 

(Pn^ 


E 


j'2 = l 

{*1 ji}n{i 2 j 2}#0 


^ ' I®^lfcl^'lfcl^lfc2^'lfc2^'2fc3^2fc4^2fc5^2fc6 | 
1 


< 


< 


(logn)' 


n 

^ ^ I®^lfcl^'lfel^lfc2 ^'1*2 ^2*6 ^' 2 * 3 ^ 2 * 6 X 12^4 I 

(A.18) 


E 


U)*2,ii,i2 = l ki,...,kQ — l 
{^lOl}n{ 22 ,j 2}?^0 


(logn) 2 d 6 « 


where we used for the bound 


lEdii fcj X/l *1 ^*1 *2 X/l *2 ^*2*5 X/2*3 ^* 2*6 X/2*4 I 


/ 7 \ ^ — 4 

(n) for i = #{/cl,/^2,fc3,fc4,1^5,fc6}■ 

Again by (A.13), we obtain with the same argument as for $I_{1,2}$


< - 


^ 1,3 ^ 


< 


(logn)^ 


^ ^ ^ ^ l®^lfc5^lfe6Xj'l*lXj'l*2^2*7^2*8Xl2*3Xl2*4 I 


(log 

PP 


*l42.ilj2 = l *l,...,fc8 = l 

{il,ll}n{i 2 J 2}#0 

2 J 6 a 


with 


|E^l*5^lfc6Xj'l*lXj'l*2^2fc7^2fc8 Xj'2 *3 XI 2 *4 I 




2 — 4 


d^“(4 for Z = #{fci,fc 2 , 1=3, ^4,^5, ^6,^7, fcs}- 









As concerns $I_2$, define



and note that Uij^d,n is bounded by a constant multiple of because Nij^d,n 

contains at most n elements, d « ^ 1 since by Subsection A.l min^pi is uniformly 
bounded away from zero, \Yik,d,n\ ^ by Subsection A.3 and Subsection A.4, 




M... X! 




Hence, 


d 2 


® ^*2^21^) E (Cfilji 1a) E 1a) 


*1,*2 J'l,i2 = l 
{*1 ji}n{i 2 j 2}=0 


1 ^ 

^ ^2 I — E([/i^j^l7i2j2lA=)+E(17iijilA=)E(17i2j2) 


nh 2 .ji j 2 =i 
{il Jl}n{i2 J2}=0 


■E 


) E {Ui2j2 Iac) ~ E Iac) E {Ui2j2 Iac) | 


— A2 E ) E (17^2^21^“=) 


d? 

*l,i 2 .jl J 2 = l 
{*1 Jl}n{i 2 J 2 } = 0 

< ^ 12 + 8 a- 27 ^^ 


Note that by the choice of $\gamma$ in (5.40) the exponent in the last line is strictly smaller than
— 1. Therefore by the lemma of Borel-Cantelli hd^n^A^ 0 almost surely {d -A 00 ). In 
the following subsection we redefine the matrix Td^n by Td^n- 


A.6. Step VI: Removing $\frac{1}{n} W \circ ((Y \circ \varepsilon)(M \circ \varepsilon)^* + (M \circ \varepsilon)(Y \circ \varepsilon)^*)$

By the same arguments as in Subsection A.4 we return to the original centered and
standardized matrix. Define

Td,n — ^{Yd^n ^d,n){Yd^n ^ ^d,n') ^ ■ 
We prove that

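almost surely. For $\gamma > 1$, define the event

$$A_{d,n} = \Big\{ \max_{i=1,\dots,d} |N_{ii,d,n} - n p_{i,d,n}| < \gamma \sqrt{n \log n} \Big\}.$$

Note that

$$\Big\{ \max_{i} |N_{ii,d,n} - n p_{i,d,n}| < \gamma \sqrt{n \log n} \Big\} = \Big\{ \max_{i} \Big| \sum_{k=1}^{n} (\varepsilon_{ik,d,n} - p_{i,d,n}) \Big| < \gamma \sqrt{n \log n} \Big\}$$

for $d$ sufficiently large. The union bound and Hoeffding's inequality yield

$$\mathbb{P}(A_{d,n}^c) \le 2 d n^{-2\gamma^2}. \qquad \text{(A.19)}$$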
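The step from Hoeffding's inequality to (A.19) can be checked numerically. The following is a minimal sketch, assuming Bernoulli observation indicators with a common success probability `p`; the function names and the parameter values are illustrative and not taken from the paper. It compares the exact two-sided binomial tail with the Hoeffding bound $2\exp(-2t^2/n)$ at $t = \gamma\sqrt{n \log n}$, which equals $2n^{-2\gamma^2}$, and multiplies by $d$ for the union bound.

```python
import math


def exact_two_sided_tail(n: int, p: float, t: float) -> float:
    """Exact P(|Bin(n, p) - n*p| >= t), summed term by term."""
    total = 0.0
    for k in range(n + 1):
        if abs(k - n * p) >= t:
            total += math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
    return total


def hoeffding_bound(n: int, t: float) -> float:
    """Hoeffding's bound 2*exp(-2 t^2 / n) for sums of n variables in [0, 1]."""
    return 2.0 * math.exp(-2.0 * t * t / n)


def union_bound_a19(d: int, n: int, gamma: float) -> float:
    """Right-hand side 2 d n^{-2 gamma^2} of (A.19)."""
    return 2.0 * d * n ** (-2.0 * gamma**2)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not from the paper).
    n, d, gamma = 400, 200, 1.1
    t = gamma * math.sqrt(n * math.log(n))
    for p in (0.3, 0.5, 0.9):
        print(p, exact_two_sided_tail(n, p, t), hoeffding_bound(n, t))
    print("union bound over d coordinates:", union_bound_a19(d, n, gamma))
```

Since $\gamma > 1$ and $d$ is of the same order as $n$, the right-hand side of (A.19) is summable in $d$, which is what the Borel-Cantelli argument requires.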

By the Borel-Cantelli lemma, all but finitely many of the events $(A_{d,n})$ occur. Moreover,
for i < ?7 < 1 define the event 


Bd,n = S ^ 1 


\m^dn\ > 


. i=l 


dSA-v) 


<dp\. 


First observe that by the same type of argument as used in (A.13) and by Markov's
inequality


maxP \m^^d,n\ > 


d?A-v) 


: Ad,n 


< maxi 


nmmpi^d,n 


^ ^ ^ik^dj-rSAik^d^n 


> 


d^A-A 

n 


< 


^ik,d,n^ik,d,n) 
9 . 2 


< d^^- 


where we have used 


1 


< 


Nu^d,n nrRmpi^d,n 
for d sufficiently large in the first inequality. In particular, 

fdH^) 


El < \mi^d,n\ > 


= El < \'m^^d,n\ > 
for some suitably chosen constant $\kappa > 0$. We conclude for $d$ sufficiently large by Hoeffding's
inequality


<p El' 


‘dli d,n ^ 


, i=l 




- El < Thidn > 


d 2 (l-r;) 


> d’' - AC ^ 


< p[ Elf > 


d 2 (l-';) 


, /d 2 (i-';) 1 

- El <( mi^d,n > \ - > > 

'' n 2 


< exp — 




-1 


By the Borel-Cantelli lemma, all but finitely many of the events $(B_{d,n})$ occur.
Let $\gamma' > 0$ be an appropriate constant such that for all $n$


$$2 \sum_{k=1}^{n} \mathbb{E}|Y_{ik,d,n}| < \gamma' n.$$


Then, define the event 


Dd,n = ^ E 1 1 E \^ik,d,n\ > l'n \ < 


.1=1 kfc=i 


logd 


In the next step we shall prove that $\mathbb{P}(\limsup_d D_{d,n}^c) = 0$ in order to remove the corresponding
rows from the matrix $Y$. By Chebyshev's inequality we have


maxP ( E %k,d,n\ > in ) 

* V^i / 

( n n 

E 

k^l k^l 

< maxP I^E l^*fc,d.n| - ^Yik,d,n\ > 


\k^l 


K 

< — 
n 


for an appropriate constant $\kappa' > 0$. Again, by Hoeffding's inequality for sufficiently
large $d$,

P(7^S.J<pfEl|E|l^Wd.«| >7'n|-El|E|y,fc,,,„| >7'n| > ^ 


, 2=1 Kk^l 
- ^ ^ \'^ik,d,n\ > - El \Yik,d,n\ > I'n^ > 

d 


< exp — 


2(logd)V ’ 

and therefore $\mathbb{P}(\limsup_d D_{d,n}^c) = 0$. Now let

Td^n — [ {Yd^n ® ^d,n^{Yd^n ® ^d,n) {Yd^n ® ^d,n'){^d,n ^ ^d,n) 


where 


and 


b^ik,d,n — ^ik,d,n^ A |'^z/i:,d,n| ^ 


(^b\dd^n ® ^d^n^(Yd^n ® ^d^n) J : 

(i2(l-r/) 1 


Yik^d.n — hi/c,tZ,nlf ^ |hiZ,d,n| ^ T ■ 

By Theorem C.8 and due to $\mathbb{P}(\limsup_d (D_{d,n}^c \cup B_{d,n}^c)) = 0$, we conclude by the same type
of arguments as in Subsection A.4


db (, 




~ d ( ^^d,n ^ ^(Prf,n ^ ^d^n^{d^d,n ^ ^d,n) “t” {d^d,n C ^d,n')(Yd^n O Sd^n) 


(Yd^n ® ^d,n){^d,n ® ^d,n) {d^d,n ® ^d,n}{Yd^n ^ ^d.n) 


$\to 0$ almost surely as $d \to \infty$.

In order to save space the explicit dependence on d and n is suppressed in the displays 
until the end of the section. By Theorem C.9, 


< i tr ( (^W o ((f o s){M o e)* + (M o £)(f o 


(A.20) 


X ( —W o ({Y o e){M o e)* + (M o e){Y o e)* 


; tr ( o ((M o e)(f o e)*{Y o e){M o e)* + (f o e){M o e)*(f o e)(M o e)*) 
4 / 

1 T /, ~ 

< -tr 

0 ((M 

-d ^ 

n2 

I"C 

VI 

mjl < \mi\ < 

2=1 

[ 


x' 


(A.21) 


d2(l->7) 


E -2K E 1 E^ 


i=i 


< 


rf 2 (^-l) 

dn^ 


d d 


\k^l 

2 


. / = 1 


EE E Sik^jkYjk I 1 "I 1^7 I < 7 ^ r I 


(A.22) 


2=1 J = 1 \fc = l 


. Z = 1 


where we have used the elementary inequality

$$\operatorname{tr}(C^2) \le \operatorname{tr}(C C^*) \quad \text{for any } d \times d \text{ matrix } C$$

in (A.21). It remains to prove that the last line (A.22) converges to zero almost surely.
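The elementary trace inequality used in (A.21) can be illustrated for real matrices, where $C^* = C^T$ and the gap has the closed form $\operatorname{tr}(CC^T) - \operatorname{tr}(C^2) = \tfrac12 \|C - C^T\|_F^2 \ge 0$. The sketch below is restricted to real entries as a simplifying assumption; the helper names are hypothetical.

```python
import random


def matmul(a, b):
    """Plain product of two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)] for row in a]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def trace_gap(c):
    """Return tr(C C^T) - tr(C^2), which equals ||C - C^T||_F^2 / 2 >= 0."""
    return trace(matmul(c, transpose(c))) - trace(matmul(c, c))


if __name__ == "__main__":
    random.seed(1)
    c = [[random.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(5)]
    print("tr(CC^T) - tr(C^2) =", trace_gap(c))
```

The gap vanishes exactly when $C$ is symmetric, so the inequality costs nothing for the symmetric part of the matrix and only discards the antisymmetric part.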
Let $\eta < \eta' < 1$, and rewrite

maxP (e(e ^ik^jk^jk ] ijEii^,7i<yr7|> ' 

\j=i \fe=i / 


1=1 


d 2 W-l) 


d / n 


= maxE <1 P I 51 E ^^kejkYjk M E ^ 2 (^- 1 ) 

yj=l \k=l ) 


. 1=1 


(A.23) 


Define for $\eta' < \eta'' < 1$ the random variables


^ij,d^n — 1 


5^ ^il,d,n^jl,d,nYjl^(i,r. 


1 = 1 


> , 1 < i,j < d. 


Then by Markov's inequality for the conditional probability and an appropriate constant
$\kappa'' > 0$,


nhj\e) = 




1=1 


> 


'J nd 2 (''"“i) 


^ E/ 


1=1 


nd?W-i) - (i 2 (^"-i) ■ 


The inner conditional probability in line (A.23) can be further estimated by 


’ d / n 

EE SikSjkYjk ] 1 S E 


\fc=l 


/ I 1=1 


Yji\ <^11.} > 


d 2 W-l) 


<P I (7'n)2 5]l|y^y''"-i) < 


E! ^il^jlYj 


1=1 


< jn > > 


2 d 2 (')'-i) 
P I 1 

i=i 


'Y/ 


1^1 




2(i2(r;'-l) 


where the last conditional probability disappears for d sufficiently large. For the first 
probability on the right hand side, we obtain 




i=i 




< y'n > > 


<P (yn)2^(/y-E(/,,|e)) 


> 


n 


— HI 


2d2(’v'-i) 


t=i 


2rf2(r,'-l) rf 2 (V'-l) 


< 


’ a 

Y{hj-E{I,,\e)) > 


,3 = 1 


4'y^2^2(?7^ —1) 


for d sufficiently large. Finally, by Hoeffding’s inequality the last line is bounded by 

“P (-8yi,'-=) ■ 

Altogether, (A.22) is bounded by ^ with probability 

'^2(^-1) 


1 -: 


dn'^ 


E E E 1 E ^ R 


2=1 j—1 \k—l 
d / 


. 1^1 


> 1 — d max P EE Sik^jkYjk I 1 < E — 7 ^ C ^ 


\fc=l 


. l=l 


d2(r;'-l) 


By the Lemma of Borel-Cantelli, 




almost surely. Consequently, 

$\to 0$ as $d \to \infty$.

Subsequently, we denote „ by Td „. 
A.7. Step VII: Diagonal manipulation


Rewrite the matrix $T_{d,n}$ in the following way
1 

-I 
n 


— i^d,n'dJd^n) ® ° ^d,n')0^d,n ^ ^rf,n) ^ 


— diag 


^ {’^d,n'd^d,n) ^ ^ ^d,n){^d^n ^d^n) ^ 

~lVi,n ^ ^ ^d,n')i^d,n ^ ^d,n') ^ 


In this step we replace the diagonal matrix 


Sd,n ■ = diag 


-{Wd,nW*d^n) ° [(Xd,n O £d,n)iYd,n O £d,n) 

£’ ^(d^,n ^ £d,n}{Yd^n £’ £d,n) ^ 


by its diagonal deterministic counterpart Sd,n with 

O _ rj ^ ■ _ 1 7 

Pi,d,n 

To this end, we use similar arguments as in the last subsection. In contrast to the last
subsection we cannot simply rely on Markov's inequality since $Y_{ik,d,n}$ is assumed to
possess only two moments. In order to save space the explicit dependence on $d$ and $n$ is
suppressed in the displays until the end of the section. Note that for any $u > 0$,


ttmax = max i 


= max JJ 
2=1,...,d 


< max I 


^ Sii Sii 

1 -Pi 


npi 


fc=i 


T,[yY-?-T, 


Pt 


> u 


^ - Pi rp Pi 

^ a 
Pi 


n 


2 ^ik 

np. — 




> u, 


^ '^ {,£ik Pi) 


k=l 


> a /n log 7 


■ max P 

i—l....,d 


Pi rji 1 - P* 

d- ii 


n 




Pi 


np, — Pi 


> u, 


E(^ifc ~P^) 




< \/n log n ). 


As concerns the first term in this last inequality, Hoeffding’s inequality yields 


max JJ 

i—l,...,d 


^-Pirj. 1-P* 
d- ii 


Pi 




"PI S Pi 


> u, 


E(^»fc ~pi) 




> yn log n 
$$\le \max_{i=1,\dots,d} \mathbb{P}\Big( \Big| \sum_{k=1}^{n} (\varepsilon_{ik} - p_i) \Big| > \sqrt{n \log n} \Big) \le 2 n^{-2}.$$


In order to bound the second term, note that 


max i 


= max 

i—l,...,d 


^ - Pi rj. 1 - P* 
-L ii 


Pt 

[npi + y/n log n] 

E : 

I— I "npi — y/n log n] 
\npi + y/n log n\ 


E Y2 ^ 

P^ 

K — 1 


= max 

i—l,...,d 


E 


l— I "npi — y/n log n] 


> U, 


- Pi) 




< n log 1 


1 - P* T. 1 - Pi 

ii 




P» 


np, — p, 


> u, y^ eik = l 


k=l 


1 - Pi rr. 1 - Pi 


It — 

j- 47. 




P* 


np. — p. 


> u 


^ ^ ^ik — ^ 




[npi + \/n log nj 


= max 

2=1,...,ci 


E 


l—\npi — -\/n log n] 


P* “ npi Pi 

k—1 


X P I^^Eife = ij 

> MI p I y^£jfc = z 


^fc=l 


(A.24) 


where the last identity holds true because $Y_{i1,d,n}, \dots, Y_{in,d,n}$ are i.i.d. and jointly independent
of $\varepsilon_{d,n}$. By the elementary inequality


1 >"2 

Tu--Y.— 


< 


\npi-y/n logra „ 

Pi 


\npi-y/n 

Tu-- Y. 


fc=l 


[npi + ^n log raj „ 

Tu-- Y — 

n 

fe=i 


Pi 


we conclude 


(A.24) < max I 

i=y...,d 


max 

2=l,...,d 


rnp,-Vnlognl „ 

^ ~ P» y _ ~ P» ^ 

Pi “ npi Pj 

/c=l 


> U 


, [npi+^/n log n] , 

P^Tu - Y — 

Pi ” i^P* P* 

/c=l 


> 


< 2 max 


. , L"PiJ 

^-Pirp 1 “ Pi v2 

Pi m 


> 


1 -p» 

np2 


[ \/n log n I +1 


E 


/c=l 


ifc > 2 
< 2 max 


1 ~ Pi 1 — Pi rr v'i 

d-ii n y 


Pi 


npl 


ik 


fc=l 


U 

^ 2 


2T,,(l-pi) / /logn 2 


wpf 


For n sufficiently large, the last expression is bounded by 
l"PiJ 


2 max 


1 


Vnpi\ 


E - 1 ) 


> 


UPi 


4Tu{l-pi) logn 


4(T,, V 1) 


upt 


(A.25) 


Note that by Subsection A.3 and Subsection A.1

lim inf min — Pi,d;n — ^ q j i i _ qq^ 

d—foo i—d n V 1 d—loo ’ ’ 

Hence, by the weak law of large numbers, (A.25) converges to zero as $d \to \infty$, which
implies $\alpha_{\max} \to 0$. Now, with $\alpha_i = \mathbb{P}(|S_{ii} - \bar{S}_{ii}| > u)$, $i = 1, \dots, d$,


E 1 { ^li- Sii > u| 


. 2=1 


^ u \ ^ 2,d\ Qlrnax V A / , 






a 

d 


Q.. _ Q.. 


Q.. _ Q.. 
^11 


>■ CXi ^max ^ \j ^ ^^max 


•} 


> u> — ai > d* 


< exp (^—2Vd^ , 

where we used Hoeffding’s inequality in the last line. Therefore, 


E 1 - ‘S'ji.d.n > 0 


as d —>■ oo. Let Sd,n be the diagonal matrix with entries 


''22,d,n — 


Sir I 




< U 


} 


We conclude by Theorem C.9 and Theorem C.8 that almost surely for sufficiently large 
d 


dh ( 


ll'i'd,n ,.Td,n — Sd,n-\-Sd,-. 
fd , ^ 
< Sd,n + Sd,n ^ Sd,ri + Sd,n'^ 


< - 
- d 


1 

“ d 


1 f I 

— rank ^Sd^n Sd^n^ ^ ^ ^ (ySa.d.r. 

V i=l 


1/3 


-s,, 


n ^ii.d.n 


a 

^ ^ 1 Sii^d,n Sii^d^n ^ “t“ U 


,2/3 


< 2^2/3. 


Since the constant u > 0 is chosen arbitrarily, we have 

for $d \to \infty$.


A.8. Step VIII: Reverting the truncation 

Reverting finally the truncation steps I, III, IV yields the claim. 


Appendix B: Proof of Proposition 6.1 

Define $\hat{X}_{d,n} \in \mathbb{R}^{d \times n}$ by $\hat{X}_{ik,d,n} = X_{ik} \mathbb{1}\{|X_{ik}| < \delta_{d,n} \sqrt{n}\}$. By Lemma 2.2 (truncation
lemma) of Yin, Bai and Krishnaiah (1988) for $r = 1/2$, given any preassigned decay rate
to zero, there exists a sequence $(\delta_{d,n})$, $\delta_{d,n} \to 0$, with lower speed of convergence than
that decay rate such that


$$\mathbb{P}\big( \hat{X}_{d,n} \neq X_{d,n} \text{ infinitely often} \big) = 0.$$


Let $(\delta_{d,n})$ be a sequence satisfying the truncation lemma with


1 

7^ 


0 ( 1 ). 


(B.l) 


Therefore, 


limsup 

d—¥(x> 


-^d,n ® ((-^d,n ^ ^d,n) {^d,n -^d,n) ) 

~^d,n ^ ^ ^d,n^ (^^d,n ^ ^d,ri^ ^ 


= 0 . 
Now let $\tilde{X}_{d,n}$ be the random matrix with entries $\tilde{X}_{ik,d,n} = \hat{X}_{ik,d,n} - \mathbb{E}\hat{X}_{ik,d,n}$. We prove
~^d,n ® ^^Xd^n ® ^Xd^n ® ^d,ni^ ^ 

~^d,n ^ ^^Xd^n ^ ^d,n^ (^Xd^n ^ ^d,n^ 


lim sup 

d—¥(x> 


= 0 . 


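As $\mathbb{E}X_{11} = 0$, note first that

$$|\mathbb{E}\hat{X}_{11,d,n}| = \big| \mathbb{E}X_{11} - \mathbb{E}X_{11}\mathbb{1}\{|X_{11}| \ge \delta_{d,n}\sqrt{n}\} \big| = \big| \mathbb{E}X_{11}\mathbb{1}\{|X_{11}| \ge \delta_{d,n}\sqrt{n}\} \big| \le \mathbb{E}X_{11}^4 \, n^{-3/2} \delta_{d,n}^{-3}. \qquad \text{(B.2)}$$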


Using the triangle inequality, the bound $\|\cdot\|_{S_\infty} \le \|\cdot\|_{S_2}$ as well as the inequality

$$\|C\|_{S_\infty} \le \max_{j=1,\dots,d} \sum_{i=1}^{d} |C_{ij}| \quad \text{for symmetric } C \in \mathbb{R}^{d \times d}$$

in (B.3), we conclude

~-^d,n ® ® ^d,ri^ ^^d,n Bd,n^ 

~-^d,n ^ ^ ^d.n^ (^^d,n ^ ^d,n^ 

-^d^n ® ( (^^d,n Bd,n^ (^^d,n ® Bd,n^ (^Bd,n ^^d,ri 




< 

n 


E E ^ik,d,n^ik,d,n^jk,d,n^^jk 


\.j: 


(B.3) 


\k^l 


+ dmax \A,j d,n\ ( maxS^^^, ^ „ ) (EXn d, 
i,j \ ik ’ ’ / V 

< 2i/-^|EXii.rf,„|max|Ay^.n| (maxB,^^,^ 

M r> o o \ ik ’ ’ 


d max W XH, 


(B.4) 


fe=l 


+ dma.x\A^j^d,n\ (maxB,^fc^„ I (EXn^a 

ij \ ik ’ ’ / ^ 


0 a.s., 
where the first summand in inequality (B.4) tends to 0 by (6.1), (B.1), (B.2) and the
Marcinkiewicz-Zygmund strong law of large numbers (cf. Lemma B.25 in Bai and Silverstein
(2010) with $\beta = 1$ and $\alpha = 3/4$). Since the entries of $\tilde{X}_{d,n}$ all have the same finite
variance and d « assume for convergence statements about 


® (^(^Xd^n ® bSd^n^ 


n ® 33d n 


that the entries of $\tilde{X}_{d,n}$ have unit variance. In order to apply the Borel-Cantelli
lemma, we need to show that the probabilities


-^d,n ® ^^Xd^n ® 33d^r^ ^Xd^n ® 33d^n^ ^ 


> za 


are summable over d G N for any z > {\ -\- By Markov’s inequality and because of 

ll'S'll^ < tr for any symmetric matrix S and Z G N, it is sufficient to show that for 
any sequence {ld,n) of even integers with 

^d,n/logrH -00 and logu ^ 0, 


(Ad,no[[Xd,nOBd,n) {Xd,n O Bd,n) ) 


we get 

md,nM.r. = Eti' 

where (l -I- < 77 < z is an absolute constant and Ed^n is the event 

Ed,n = i max \A^j,d,n\ ( max Bf,^dn] < a 1 • 

K ifj \ i,k ’ ’ / J 

We have by independence of Xd^n and {Ad,n, Bd,n), 


< (ary)''"'", 


'^d.n.ld n — ^ 


^—Id 


E E E 


liSd.n G ^*2*3 ‘ ‘ ' A. 




X E 


X h, Bj. 


Xi^f^-^^Xj^^ki ' ’ ’ Xi 


■ Bi, 


B, 






E E 

„=1 „=1 


E 


^iiki^i2ki ' ' ' ^ 
for d sufficiently large in which case the inequality


id 


E 


E 


= 1 


E 




■x,„ 


X., 


ilki. 


<ri‘ 


Id,, 


has been shown in the proof of Theorem 3.1 in Yin, Bai and Krishnaiah (1988). 


Appendix C: Auxiliary results 


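Lemma C.1. [Lemma 4 in Couillet, Debbah and Silverstein (2011)] Let $A \in \mathbb{C}^{d \times d}$,
$\tau \in \mathbb{C}$ and $r \in \mathbb{C}^{d}$ such that $A$ and $A + \tau r r^*$ are invertible. Then

$$r^* (A + \tau r r^*)^{-1} = \frac{1}{1 + \tau r^* A^{-1} r} \, r^* A^{-1}. \qquad \text{(C.1)}$$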
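Identity (C.1) can be verified without computing the inverse of $A + \tau r r^*$: multiplying the claimed right-hand side by $A + \tau r r^*$ from the right must return $r^*$. The sketch below does this for a diagonal $A$, which is a simplifying assumption made only so that $A^{-1}$ is trivial in plain Python; the function name and the test values are illustrative.

```python
def lemma_c1_residual(a_diag, r, tau):
    """Max abs entry of v (A + tau r r^*) - r^*, where v is the right-hand side of (C.1).

    a_diag holds the diagonal of A (assumed diagonal here), r is a list of
    complex numbers, tau is a complex scalar.
    """
    r_star = [z.conjugate() for z in r]
    # q = r^* A^{-1} r for diagonal A
    q = sum(rs * z / a for rs, z, a in zip(r_star, r, a_diag))
    # v = r^* A^{-1} / (1 + tau q), the claimed value of r^* (A + tau r r^*)^{-1}
    v = [rs / a / (1 + tau * q) for rs, a in zip(r_star, a_diag)]
    v_dot_r = sum(vj * z for vj, z in zip(v, r))
    # v (A + tau r r^*) = v A + tau (v r) r^*, compared entrywise with r^*
    return max(abs(vj * a + tau * v_dot_r * rs - rs) for vj, a, rs in zip(v, a_diag, r_star))


if __name__ == "__main__":
    a_diag = [2 + 1j, -1 + 0.5j, 3 + 0j]
    r = [1 + 1j, 0.5 - 2j, -1j]
    print(lemma_c1_residual(a_diag, r, 0.7 - 0.2j))  # close to machine precision
```

The same algebra goes through for any invertible $A$, since $vA = r^*/(1 + \tau q)$ and $v r = q/(1 + \tau q)$ with $q = r^* A^{-1} r$.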


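Lemma C.2. [Lemma 2.6 in Silverstein and Bai (1995)] Let $z \in \mathbb{C}^+$, $A, B \in \mathbb{C}^{d \times d}$, $B$
Hermitian, $\tau \in \mathbb{R}$ and $q \in \mathbb{C}^{d}$. Then

$$\Big| \operatorname{tr}\Big( \big( (B - z I_{d \times d})^{-1} - (B + \tau q q^* - z I_{d \times d})^{-1} \big) A \Big) \Big| \le \frac{\|A\|_{S_\infty}}{\Im z}. \qquad \text{(C.2)}$$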


Lemma C.3. [Lemma 8 in Couillet, Debbah and Silverstein (2011)] Let $C = A + iB + iv I_{d \times d}$,
with $A, B \in \mathbb{R}^{d \times d}$ symmetric and $B$ positive semidefinite, $v > 0$. Then

$$\|C^{-1}\|_{S_\infty} \le v^{-1}. \qquad \text{(C.3)}$$


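Lemma C.4. Let $Z = (Z_1, \dots, Z_d) \in \mathbb{R}^{d}$ be a centered random vector with components
bounded in absolute value by some constant $c > 0$. Then for any $p \ge 1$,

$$\mathbb{E}\big| \|Z\|_2^2 - \mathbb{E}\|Z\|_2^2 \big|^{p} \le C^{p} p^{p/2} d^{p/2}, \qquad \text{(C.4)}$$

$$\mathbb{E}\|Z\|_2^{2p} \le C^{p} p^{p/2} d^{p}, \qquad \text{(C.5)}$$

where the constant $C > 0$ depends on $c$ only.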

Proof. The lemma is an easy consequence of Lemma 5.9 of Vershynin (2012) together 
with the Definition 5.7 of the subgaussian norm of Vershynin (2012), since 


-{\\Z\\l-E\\Z\\l) 


2 . d 

b2 “ i=i 


2||2 
i ll'02 
where A corresponds to the absolute constant of Lemma 5.9 of Vershynin (2012), and
2 1 1 2 




4’2 


^-E\\Z\\l + ^{\\Z\\l-E\\Z\\l) 


'02 


< 2 


E||Z|| 


■02 


(11^112-Ell^ll 


02 y 


< 2 + 


16A\ 




□ 


Lemma C.5. Let $d/n \le c_1$ and $Z_1, \dots, Z_n \in \mathbb{R}^{d}$ be a sample of i.i.d. random vectors
with centered and independent components of variance 1 and bounded in absolute value
by some constant $c_2 > 0$. Denote the largest eigenvalue of the matrix $\frac{1}{n} \sum_{k=1}^{n} Z_k Z_k^*$ by
$\lambda_1$. Then for any $p \ge 1$,

$$\mathbb{E}\lambda_1^{p} \le C,$$

where $C$ depends on $c_1$, $c_2$ and $p$ only.


Proof. Since

$$\frac{1}{n} \sum_{k=1}^{n} Z_k Z_k^* = \frac{1}{n} Z Z^*,$$

where the $k$-th column of the matrix $Z \in \mathbb{R}^{d \times n}$ is given by $Z_k$, we have $\lambda_1 = s_1^2$ with $s_1$ the largest
singular value of $n^{-1/2} Z$. Dividing the right-hand side of inequality (5.22) of Vershynin
(2012) by $\sqrt{n}$ yields

$$s_1 \le \sqrt{c_1} + A_1 + \frac{t}{\sqrt{n}}$$

with probability at least $1 - 2\exp(-A_2 t^2)$ for some constants $A_1, A_2 > 0$ depending on
$c_2$ only. Therefore,

E\P = Esi^’ 

a;^^P(si > x)dx 

pOO 

< (i/cT -I- Ai)^^ + 2 / exp (-A 2 n(x - (i/cq -b Ai))^) dx 

J y^^+Al 

pOO 

< (i/ci -b Ai)^^ + 2 / (a; -b a/cT + Ai)^^ exp (-A 2 na;^) dx 

Jo 

<C, 

where C can be chosen independently of n. □ 
Lemma C.6. Let $U_1, \dots, U_d$ be i.i.d. $\mathbb{C}$-valued random variables with $\mathbb{E}U_1 = 0$,
$\mathbb{E}|U_1|^2 = 1$, $|U_1| \le C$ for some constant $C > 0$ and $A \in \mathbb{C}^{d \times d}$. Denote $U = (U_1, \dots, U_d)^*$.

Then

$$\mathbb{E}|U^* A U - \operatorname{tr} A|^{6} \le c \, \|A\|_{S_\infty}^{6} d^{3} C^{12}$$

with a constant $c > 0$ which does not depend on $d$, $A$ and the distribution of $U_1$.

Proof. The proof follows the lines of Lemma 3.1 in Silverstein and Bai (1995) by replacing
the logarithmic bound on the entries of $U$ with $C$. □


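Lemma C.7. For $d \in \mathbb{N}$ and $n = n_d \in \mathbb{N}$ with $\limsup_d d/n \le c_1 < \infty$ let $X_{1,d}, \dots, X_{n,d}$
be i.i.d. $d$-dimensional, centered random vectors with variance 1 such that

$$\limsup_{d \to \infty} \max_{i=1,\dots,d} \max_{k=1,\dots,n} |X_{i,k,d}| \le c_2$$

almost surely and $R_d \in \mathbb{R}^{d \times d}$ be a positive definite diagonal matrix with

$$\limsup_{d \to \infty} \max_{i=1,\dots,d} |R_{ii,d}| \le c_3.$$

Then,

$$\limsup_{d \to \infty} \lambda_{\max}\Big( \frac{1}{n} \sum_{k=1}^{n} R_d^{1/2} X_{k,d} X_{k,d}^* R_d^{1/2} \Big) \le c \quad \text{a.s.} \qquad \text{(C.6)}$$

for some constant $c > 0$ depending on $c_1$, $c_2$ and $c_3$ only.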


Proof. Since the random variables are uniformly bounded, which implies uniform subgaussian
tails, Theorem 5.39 of Vershynin (2012) applies. The particular choice $t = \log d$
yields






<d_^c+^^ 

n n 


with probability at least $1 - 2\exp(-C'(\log d)^2)$ for two positive constants $C, C'$ which
depend only on $c_1$ and $c_2$. Hence, the claim follows by the Borel-Cantelli lemma. □


Theorem C.8 (Theorem A.43 in Bai and Silverstein (2010)). Let $A$ and $B$ be two $d \times d$
Hermitian matrices. Then,

$$d_K(\mu^A, \mu^B) \le \frac{1}{d} \operatorname{rank}(A - B), \qquad \text{(C.7)}$$

where $\mu^A$ and $\mu^B$ denote the spectral distributions of $A$ and $B$, respectively.
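The rank inequality (C.7) is easiest to see in the special case of diagonal matrices, where the eigenvalues are the diagonal entries and $\operatorname{rank}(A - B)$ is the number of diagonal positions in which $A$ and $B$ differ. The following sketch covers only this special case (an assumption made to stay within the standard library); the function names are hypothetical.

```python
def kolmogorov_distance(xs, ys):
    """sup_t |F_x(t) - F_y(t)| for the empirical distribution functions of xs and ys.

    Both functions are right-continuous step functions, so the supremum is
    attained at one of the jump points.
    """
    points = sorted(set(xs) | set(ys))

    def ecdf(sample, t):
        return sum(1 for value in sample if value <= t) / len(sample)

    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in points)


def diagonal_rank_bound(a_diag, b_diag):
    """rank(A - B) / d for diagonal A and B: share of differing diagonal entries."""
    d = len(a_diag)
    return sum(1 for a, b in zip(a_diag, b_diag) if a != b) / d


if __name__ == "__main__":
    a_diag = [1.0, 2.0, 3.0, 4.0]
    b_diag = [1.0, 2.0, 3.0, 10.0]  # rank-one perturbation of A
    print(kolmogorov_distance(a_diag, b_diag), "<=", diagonal_rank_bound(a_diag, b_diag))
```

Moving a single eigenvalue arbitrarily far changes the spectral distribution function by at most $1/d$ in the supremum norm, which is why low-rank modifications are negligible for the limiting spectral distribution.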


Theorem C.9 (Corollary A.41 from Bai and Silverstein (2010)). Let $A$ and $B$ be two
$d \times d$ Hermitian matrices with spectral distributions $\mu^A$ and $\mu^B$. Then,

$$d_L^3(\mu^A, \mu^B) \le \frac{1}{d} \operatorname{tr}\big( (A - B)(A - B)^* \big). \qquad \text{(C.8)}$$
Theorem C.10 (Theorem A.38 in Bai and Silverstein (2010)). Let $\lambda_1, \dots, \lambda_d$ and $\delta_1, \dots, \delta_d$
be two families of real numbers and their empirical distributions be denoted by $\mu$ and $\bar{\mu}$.
Then, for any $\alpha > 0$, we have

$$d_L^{\alpha + 1}(\mu, \bar{\mu}) \le \min_{\pi} \frac{1}{d} \sum_{k=1}^{d} |\lambda_k - \delta_{\pi(k)}|^{\alpha}, \qquad \text{(C.9)}$$

where the minimum is running over all permutations $\pi$ on $\{1, \dots, d\}$.

The next lemma and its proof are essentially taken from Krishnapur (2012), Lemma
34. Since the necessary dependence of (in his notation) $\delta$ on $y$ is neither mentioned in
his statement nor its proof, we include a proof for completeness.

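Lemma C.11. Let $\mu$ and $\nu$ be two probability measures on the real line and $m_\mu$ and $m_\nu$
their Stieltjes transforms. Then for any $v > 0$ we have

$$d_L(\mu, \nu) \le 2 \sqrt{\frac{v}{\pi}} + \frac{1}{\pi} \int \big| \Im(m_\mu(u + iv)) - \Im(m_\nu(u + iv)) \big| \, du.$$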


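Proof. Let $C_v$ denote the Cauchy distribution with scale parameter $v > 0$. Recall that
its Lebesgue density $f_v$ is given by

$$f_v(x) = \frac{1}{\pi} \frac{v}{x^2 + v^2}, \quad x \in \mathbb{R}.$$

By the triangle inequality,

$$d_L(\mu, \nu) \le d_L(\mu, \mu \star C_v) + d_L(\mu \star C_v, \nu \star C_v) + d_L(\nu, \nu \star C_v). \qquad \text{(C.10)}$$

Now observe that for $\eta = \mu, \nu$ and any $z = u + iv \in \mathbb{C}^+$,

$$\frac{1}{\pi} \Im(m_\eta(u + iv)) = \int \frac{1}{\pi} \frac{v}{(\lambda - u)^2 + v^2} \, d\eta(\lambda) = f_{\eta \star C_v}(u),$$

where $f_{\eta \star C_v}$ is the Lebesgue density of the convolution $\eta \star C_v$. Therefore,

$$d_L(\mu \star C_v, \nu \star C_v) \le d_K(\mu \star C_v, \nu \star C_v) \le \frac{1}{\pi} \int \big| \Im(m_\mu(u + iv)) - \Im(m_\nu(u + iv)) \big| \, du. \qquad \text{(C.11)}$$

As concerns $d_L(\eta, \eta \star C_v)$, let $X \sim \eta$ and $Z \sim C_1$ be two independent random variables
on a common probability space, whence $X + vZ \sim \eta \star C_v$ for any $v > 0$. Using the
elementary tail inequalities

$$\mathbb{P}(Z > t) = \mathbb{P}(Z < -t) \le \frac{1}{\pi t}, \quad t > 0,$$

we obtain for any $\delta > 0$ and $x \in \mathbb{R}$

$$\mathbb{P}(X \le x - \delta) \le \mathbb{P}(X + vZ \le x) + \mathbb{P}\Big( Z > \frac{\delta}{v} \Big) \le \mathbb{P}(X + vZ \le x) + \frac{1}{\pi} \frac{v}{\delta}.$$

That is,

$$\mathbb{P}(X \le x - \delta) - \delta \le \mathbb{P}(X + vZ \le x) \qquad \text{(C.12)}$$

whenever $\delta \ge \sqrt{v/\pi}$, in which case we also have

$$\mathbb{P}(X + vZ \le x) \le \mathbb{P}(X \le x + \delta) + \mathbb{P}\Big( Z < -\frac{\delta}{v} \Big) \le \mathbb{P}(X \le x + \delta) + \delta. \qquad \text{(C.13)}$$

(C.12) and (C.13) imply

$$d_L(\eta, \eta \star C_v) \le \sqrt{\frac{v}{\pi}}, \quad \eta = \mu, \nu. \qquad \text{(C.14)}$$

Plugging (C.14) and (C.11) into (C.10) yields the claim. □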
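The smoothing step of the proof rests on two elementary facts about the standard Cauchy distribution: its upper tail $\mathbb{P}(Z > t) = \tfrac12 - \arctan(t)/\pi$ is at most $1/(\pi t)$, and the choice $\delta = \sqrt{v/\pi}$ is the smallest radius for which $v/(\pi\delta) \le \delta$. A minimal numerical sketch, with illustrative function names that are not taken from the paper:

```python
import math


def cauchy_upper_tail(t: float) -> float:
    """P(Z > t) for a standard Cauchy variable Z."""
    return 0.5 - math.atan(t) / math.pi


def smoothing_radius(v: float) -> float:
    """Smallest delta with v / (pi * delta) <= delta, i.e. delta = sqrt(v / pi)."""
    return math.sqrt(v / math.pi)


if __name__ == "__main__":
    for v in (1e-4, 1e-2, 1.0):
        delta = smoothing_radius(v)
        # The tail probability P(Z > delta / v) stays below delta, as used for (C.12).
        print(v, delta, cauchy_upper_tail(delta / v))
```

The printed tail probabilities shrink with $v$ at the rate $\sqrt{v}$, which is the source of the term $2\sqrt{v/\pi}$ in the lemma.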


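Lemma C.12. Let $\mu$, $\nu$ be two probability measures on the real line and $m_\mu$, $m_\nu$ the
corresponding Stieltjes transforms. Then for any $z \in \mathbb{C}^+$,

$$|m_\mu(z) - m_\nu(z)| \le \frac{2 \, d_{BL}(\mu, \nu)}{(\Im z)^2 \wedge \Im z}. \qquad \text{(C.15)}$$

Proof. Note that

$$\Big| \frac{1}{\lambda - z} - \frac{1}{\lambda' - z} \Big| = \frac{|\lambda - \lambda'|}{|(\lambda - z)(\lambda' - z)|} \le \frac{|\lambda - \lambda'|}{(\Im z)^2},$$

i.e.

$$\lambda \mapsto \Re\Big( \frac{(\Im z)^2 \wedge \Im z}{\lambda - z} \Big) \quad \text{and} \quad \lambda \mapsto \Im\Big( \frac{(\Im z)^2 \wedge \Im z}{\lambda - z} \Big)$$

are bounded by 1 in absolute value and 1-Lipschitz. This proves (C.15). □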
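The two properties of the kernel $\lambda \mapsto 1/(\lambda - z)$ used in the proof, boundedness by $1/\Im z$ and Lipschitz continuity with constant $1/(\Im z)^2$, can be checked directly. The sketch below uses a hypothetical helper for the Stieltjes transform of a uniform measure on finitely many atoms; the evaluation point and the atoms are arbitrary illustrative choices.

```python
def resolvent_kernel(lam: float, z: complex) -> complex:
    """The kernel 1 / (lam - z) of the Stieltjes transform."""
    return 1.0 / (lam - z)


def stieltjes_transform(atoms, z):
    """Stieltjes transform of the uniform probability measure on the given atoms."""
    return sum(resolvent_kernel(a, z) for a in atoms) / len(atoms)


if __name__ == "__main__":
    z = 0.3 + 0.5j
    atoms_mu = [-2.0, -0.5, 0.0, 0.3, 1.7]
    atoms_nu = [a + 0.01 for a in atoms_mu]  # small shift of every atom
    diff = abs(stieltjes_transform(atoms_mu, z) - stieltjes_transform(atoms_nu, z))
    # The Lipschitz property gives diff <= 0.01 / (Im z)^2 for this coupling.
    print(diff, "<=", 0.01 / z.imag**2)
```

Splitting the kernel into real and imaginary parts, each bounded and Lipschitz after rescaling by $(\Im z)^2 \wedge \Im z$, is what produces the factor 2 in (C.15).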


Lemma C.13. Let $(\mu_n)_{n \in \mathbb{N}}$ and $(\nu_n)_{n \in \mathbb{N}}$ be two sequences of probability measures on
the Borel $\sigma$-algebra on $\mathbb{R}$. Assume that $(\mu_n)_{n \in \mathbb{N}}$ is tight. Then

$$d_L(\mu_n, \nu_n) \to 0 \iff d_{BL}(\mu_n, \nu_n) \to 0. \qquad \text{(C.16)}$$

Moreover, tightness of $(\mu_n)_{n \in \mathbb{N}}$ and (C.16) imply weak convergence $\mu_n - \nu_n \Rightarrow 0$ on the
space of finite signed measures on $\mathbb{R}$.


Proof. As concerns the equivalence relation, we need only to verify that

$$d_L(\mu_n, \nu_n) \to 0 \implies d_{BL}(\mu_n, \nu_n) \to 0, \qquad \text{(C.17)}$$

because $d_L^2 \le d_{BL}$ (see, e.g. Huber (1974)). Assume that $d_L(\mu_n, \nu_n) \to 0$. Tightness
of $(\mu_n)_n$ implies that any subsequence $(\mu_{n_k})_k$ possesses a subsubsequence $(\mu_{n_{k_l}})_l$ which
converges weakly to a limiting probability measure $\mu$, say. Consequently, as both $d_{BL}$
and $d_L$ metrize weak convergence on the space of probability measures on $\mathbb{R}$,

$$d_L(\mu_{n_{k_l}}, \mu) \to 0 \quad \text{and} \quad d_{BL}(\mu_{n_{k_l}}, \mu) \to 0. \qquad \text{(C.18)}$$

By the triangle inequality,

$$d_L(\nu_{n_{k_l}}, \mu) \le d_L(\nu_{n_{k_l}}, \mu_{n_{k_l}}) + d_L(\mu_{n_{k_l}}, \mu) \to 0,$$

which in turn is equivalent to $d_{BL}(\nu_{n_{k_l}}, \mu) \to 0$. Again by the triangle inequality,
$d_{BL}(\mu_{n_{k_l}}, \nu_{n_{k_l}}) \to 0$. This proves (C.17) and therefore the equivalence relation (C.16).
As concerns the second statement, it is sufficient to show that any subsequence $(n_k)_k$ possesses
a subsubsequence $(n_{k_l})_l$ with $\mu_{n_{k_l}} - \nu_{n_{k_l}} \Rightarrow 0$. But this follows immediately from
the above arguments, because for any subsequence $(n_k)_k$, there exist a subsubsequence
$(n_{k_l})_l$ and a measure $\mu$ such that both $\mu_{n_{k_l}} \Rightarrow \mu$ and $\nu_{n_{k_l}} \Rightarrow \mu$, hence $\mu_{n_{k_l}} - \nu_{n_{k_l}} \Rightarrow 0$. □